Thematic Apperception Test: A Tentative 
Appraisal of Some “Signs” of Anxiety 


Gardner Lindzey and Arthur S. Newburg 


Harvard University 


Generalizations concerning the relation be- 
tween aspects of test performance and those at- 
tributes of the empirical world that the test 
intends to predict or diagnose stem from many 
sources. Perhaps the most fertile of these is 
careful observation by sensitive and experi- 
enced clinicians. By contrast, the role of con- 
trolled, empirical study in creating such gen- 
eralizations is relatively slight. When it comes 
to evaluating or instituting a system of checks, 
however, the contribution of controlled investi- 
gation is very great. The free-ranging sensitiv- 
ity of the clinical observer, which makes him 
an ideal generator of new ideas, also promises 
that individual bias and autistic factors will 
play some role in determining his statements. 
Thus, only when the clinician’s generaliza- 
tions have been submitted to test under circum- 
stances designed to eliminate observer bias can 
we place much enduring confidence in them. 


In the present study we selected a small 
number of generalizations concerning the The- 
matic Apperception Test and its sensitivity to 
anxiety and attempted to test these under some 
degree of empirical control. As a result of an 
earlier survey of the literature [3], we had 
available over 500 statements relating aspects 
of TAT response to characteristics of the story- 
teller. From this list we selected 18 general- 
izations concerning anxiety and attempted to 
translate each of these into an objective scor- 
ing system that would permit us to score TAT 
protocols reliably. In addition, we sought to 
secure a reasonable, independent measure of 
anxiety that could be correlated with our TAT 
“signs.” 

1This study is part of a program of research conducted 
at the Harvard Psychological Clinic under 
the direction of Professor Henry A. Murray. The research 
is supported by grants from the Rockefeller 
Foundation and the Laboratory of Social Relations, 
Harvard University. We are grateful to Shirley 
Shapiro for her assistance in the statistical analysis. 

Procedure 
Subjects 

The subjects of this study were 20 under- 
graduate males who had volunteered from an 
introductory course in psychology at Harvard 
College to participate in a program of person- 
ality study. They were paid for their time on 
an hourly basis at customary student rates. 
The subjects were selected so as to be hetero- 
geneous in regard to such factors as socioec- 
onomic and ethnic background, academic per- 
formance, and extracurricular activities. 
Thematic Apperception Test 

Administration. The TAT was administered 
under standard conditions with the two ses- 
sions separated by 24 hours or more. In all 
cases the stories were electrically recorded with- 
out the subjects’ knowledge and later trans- 
cribed verbatim. 

Scoring. Our scoring procedures were based 
on statements culled from the literature. In 
describing these we will first present the per- 
tinent quotation, and following this we will 
summarize the variable or variables that were 
designed to correspond to those mentioned in 
the statement. The reader will notice that we 
have not attempted to exhaust the meaning of 
these quotations but have simply selected, some- 
what arbitrarily, certain variables that seemed 
especially pertinent or easy to objectify. In all 
there were 18 signs or variables, presumably 
sensitive to anxiety, that we attempted to 
measure. 

While scoring, the examiner knew only that 
the subjects were male and college students. 
Every variable was scored individually for each 
story, and all stories told to a given card were 
scored before proceeding to the next set of 
stories. With the exception of the second variable, 
all scores were dichotomous as the scorer 
simply indicated for each story that the vari- 
able was present or not present (high or low). 
In some cases, this required setting relatively 
arbitrary cutting points. Thus, for all but the 
second variable, the subject’s score was made 
up of the sum of 20 yes and no scores and con- 
sequently could range from 0 to 20. 


[Anxiety is indicated by the] . . . presence of 
disruptions in the consistency of form charac- 
teristics, especially in organizational elements, 
in vacillation from concept to picture-domin- 
ated responses, and in rejections and refusals; 
content characteristics . . . of overt aggression, 
depression, mental conflicts or other outstanding 
emotional states . . . [Henry, 2, p. 48]. 

1. Intrapicture vacillation from concept- to pic- 
ture-dominated responses. A concept-dominated re- 
sponse is one whose main theme deals with an idea 
the picture suggests rather than the picture itself. 
It may be abstract, symbolic, or allegorical, and 
often relates the events of the story to the general 
case. A picture-dominated response is one whose 
main theme deals directly with the events of the 
picture itself. Often past and future action is omitted, 
and in general the story action takes place in the 
here and now of the picture. At its most restricted 
level the response may be only a description of the 
picture with little or no plot. If during a single 
story there was one or more shifts from one type of 
response to the other, the story was scored “vacilla- 
tion.” 

2. Interpicture vacillation from concept- to pic- 
ture-dominated responses. Each story was scored as 
concept, picture, or vacillation as defined for the 
preceding variable. Then the number of changes 
from concept to picture domination and vice versa 
in the 20 stories was counted. 

3. Rejections and refusals. The subject makes a 
negative comment about the picture, or his ability 
to create a story from the picture. He contradicts 
or denies or reverses part of his story. He refuses to 
tell or finish a story. 

4. Outstanding emotional states. The characters 
show strong and persistent overt aggression, depres- 
sion, mental conflict, fear, extreme happiness, love, 
worry, etc. 


. . . Phantasies with marked anxiety were 
“characterized by moving, dramatic situations 
and intense, comparatively clearcut conflicts” 
[p. 75]. . . . A high incidence of verbs denotes 
a kinetic release in the phantasy of anxious 
tensions in the narrator [p. 77]. . . . Total 
number of verbs/total number of adjectives. 
High values connote restless, forceful, dramatic 
action in the phantasies, expressing libidinal 
tensions and anxiety in the subject [p. 79]. 
. . . the phantasies in an anxiety state are brief; 
the action is most dramatic (highest verb-adjective 
quotient) and often compulsive; alternatives 
of conation are most frequently sought; 
special expressions connoting vagueness, hesitation, 
and trepidation are freely used; and 
direct identification of the narrator with characters 
in his phantasy frequently occur . . . 
phantasied situations are left as unresolved as 
the underlying emotional conflicts of the subject 
[pp. 80-81] [Balken and Masserman, 1]. 

5. Intense conflicts. The conflict may be intra- or 
interpersonal. It should involve violent emotional 
states and if it is intrapersonal the individual should 
be seen as “torn” by the two sides of the conflict. 

6. Vagueness and hesitation. The subject makes 
two or more statements showing either vagueness 
or hesitation or both, e.g., “I’m not sure,” “I don’t 
know,” “I can’t tell.” 

7. Self-identification. The narrator openly admits 
that one of the characters resembles or acts like him- 
self. He indicates that part or all of the story is 
taken from his own life. 


8. Unresolved conflicts. The story has no out- 
come. Conflicts, situations, plots are left unfinished. 
Statements like “I don’t know how this comes out,” 
“I don’t know what comes after this” are common. 
Or, conflict situations are seen as remaining so, e.g., 
“And he never forgives her for that,” “And this 
couple continues to quarrel.” 


9. Interjections. The subject makes two or more 
statements not directly related to the story he is 
telling. These may be in the form of comments about 
the picture: “My God, what’s that?” “I'll never be 
able to get anything out of that.” Or he may break 
into his story to ask the experimenter a question: 
“Is this the kind of story you want?” “Haven't I 
seen this picture before somewhere?” Or he may 
break the continuity of his story by suddenly com- 
menting on his style, delivery, plot, etc.: “This is 
quite an allegory, isn’t it?” 

10. Verb/adjective quotient. First the total number 
of verb forms was counted. The whole predicate 
was counted as one verb. Then the total of verbs 
was divided by the total number of adjectives and a 
ratio of 3.5 or better was considered high and all 
other ratios low. With this variable, and the fol- 
lowing variable, we used a random number table 
to select three stories from each set of protocols and 
used these stories as unbiased estimates of the ac- 
tual quotient for the entire 20 stories. 


11. Number of adjectives per 100 words. The 
number of adjectives was computed in the same 
manner as in the previous variable. This figure was 
divided by the total number of words and a ratio of 
.12 or greater scored high and all ratios less than 
this scored low. Again the sampling technique de- 
scribed above was used. 


12. Briefness. The total number of words in the 
story was counted. Any story with less than 175 
words was classed as brief and all others were con- 
sidered long. 


If the testing situation itself significantly raises 
the level of anxiety, a not uncommon sequel is 
invariance of stories on the level of object description: 
“The violin looks like a Stradivarius” 
[pp. 57-58]. Wishes may also be reactivated 
by memory. Individuals who suffer grief, shame 
or anxiety may tell such stories. In our 
experience the sequence of memory-wish has us- 
ually signified serious disturbance of the inner 
life [p. 64]. Only when a protocol is invariant 
with respect to the level on which the stories 
proceed may we assume that this level is a liter- 
al representation of the predominant level on 
which the individual functions. An example of 
such invariance is the individual whose anxiety 
is reflected in his stories by exclusive reference 
to the level of feeling and expectations [p. 103] 
[Tomkins, 7]. 

13. Invariance on the level of object description. 
With the exception of opening and concluding sen- 
tences such as “ ‘Lemme’ see,” “Well, I guess that’s 
about it,” every scorable sentence must be restricted 
to the simple, physical description of the objects or 
people in the picture. Neither the feelings nor 
thoughts of any of the people in the story may be 
present in any sentence. 


14. Invariance on the level of feeling and expec- 
tation. With the exception of the beginning and 
ending phrases noted in the previous variable, every 
sentence must contain affect. It is not necessary that 
the kind or quality of emotion expressed be the same. 
The narrator must attribute emotion or expectation 
throughout the story. “I feel that...,” “He is wor- 
ried (afraid, in love, fears, hates, longs for, etc.).” 
A single sentence in which this was absent scored 
the story as negative. 

15. The sequence of memory followed by a wish. 
Either the narrator or one of the characters recalls 
material from the past and this stimulates him to 
express a desire for something now. A sentence, 
clause, or phrase containing a verb of remember- 
ing is followed by a passage containing a verb of 
wishing or desiring, e.g., “The old man recalls his 
wife fondly; he wishes she were still alive.” 


No suggestive diagnostic features other than 
sporadic blocking, flurries of anxiety in the 
course of telling the stories, and frequent themes 
of apprehensiveness are to be expected; even 
these do not occur frequently [Schafer, 6, p. 
46]. 

16. Blocking. The subject’s response to the picture 
is blocked; he is incapable of responding quickly 
and freely to the stimulus. A long pause before he 
responds, or two or more pauses during the story, 
indicates this. Any statement he makes which has a 
content indicating that he feels unable to produce a 
story, or that producing one would be difficult, was 
scored. Statements such as “I can’t get any further,” 
or “I seem to be stuck,” are examples. 

17. Apprehensiveness. The presence of themes in 
which there is an anticipation of future evil or mis- 
fortune. The narrator or one of the characters in 
the story has feelings of dread or foreboding about 
the future. He has an expectation that unpleasantness 
lies ahead. “Things don’t look too bright for 
the future.” “I’m afraid the situation gets worse for 
these people.” 

Anxiety neurotics had many plots emphasizing 
sudden physical accidents and mental traumas 
such as loss of wife, mother, sweetheart or jobs, 
house burning down, stock market crash, etc., 
thus reflecting their own fears and the instability 
of their world [Rotter, 5, p. 30]. 

18. Trauma. The content of the story deals with 
a theme in which: (a) one of the characters has a 
sudden physical accident, e.g., “and then he fell and 
broke his leg,” “. . . when suddenly a rock dropped 
on her”; (b) someone undergoes a severe mental 
trauma, characterized by its lasting effects. 
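
The more mechanical scoring rules above (the per-subject totals and variables 2, 10, 11, and 12) can be sketched in code. This is only an illustration under stated assumptions, not the procedure the authors used: the function names are hypothetical, verb and adjective counts are assumed to have been tallied by hand, and the treatment of "vacillation" stories in the interpicture count is a guess, since the text does not specify it.

```python
import random
from typing import List, Sequence


def subject_score(story_flags: Sequence[bool]) -> int:
    """Sum the present/absent judgments for one variable across a subject's stories."""
    return sum(1 for flag in story_flags if flag)


def interpicture_vacillation(labels: Sequence[str]) -> int:
    """Count changes between concept- and picture-dominated stories (variable 2).

    Assumption: stories labeled "vacillation" are skipped when counting changes;
    the source does not say how such stories were treated.
    """
    dominated = [lab for lab in labels if lab in ("concept", "picture")]
    return sum(1 for prev, cur in zip(dominated, dominated[1:]) if prev != cur)


def verb_adjective_is_high(verbs: int, adjectives: int, cutoff: float = 3.5) -> bool:
    """Variable 10: a verb/adjective ratio of 3.5 or better is scored high."""
    if adjectives <= 0:
        raise ValueError("ratio is undefined without at least one adjective")
    return verbs / adjectives >= cutoff


def adjective_rate_is_high(adjectives: int, words: int, cutoff: float = 0.12) -> bool:
    """Variable 11: adjectives divided by total words; .12 or greater is scored high."""
    if words <= 0:
        raise ValueError("word count must be positive")
    return adjectives / words >= cutoff


def is_brief(words: int, cutoff: int = 175) -> bool:
    """Variable 12: a story with fewer than 175 words is classed as brief."""
    return words < cutoff


def sample_story_indices(n_stories: int = 20, k: int = 3, seed: int = 0) -> List[int]:
    """Stand-in for the random number table used to pick three stories per protocol."""
    return sorted(random.Random(seed).sample(range(n_stories), k))
```

For the dichotomous variables, `subject_score` applied to a subject's 20 story judgments gives the 0 to 20 total described earlier.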

Reliability. In estimating the reliability of 
the TAT scoring we selected at random 100 
stories, and a reliability rater scored these for 
each of the variables. These ratings agreed 
with the experimental ratings in 82% of the 
cases, indicating a relatively high degree of as- 
sociation between the two sets of ratings. 
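
The agreement figure reported here is a simple percentage of matching judgments. A minimal sketch of that computation follows; the function name and the sample ratings are hypothetical and stand in for the two raters' present/absent scores.

```python
from typing import Sequence


def percent_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Percentage of judgments on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same, non-empty set of cases")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


# Hypothetical present (1) / absent (0) judgments on five stories.
experimental = [1, 1, 0, 0, 1]
reliability = [1, 0, 0, 0, 1]
agreement = percent_agreement(experimental, reliability)
```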
Criterion Measure 

The primary independent measure was a 
“diagnostic council” rating based upon a very 
wide variety of information derived from ob- 
servation, self-report, situational tests, and a 
large number of indirect measures. After pre- 
liminary analyses, this information was pre- 
sented individually for each subject and dis- 
cussed in a council of clinical psychologists. 
During this session the group assigned a tenta- 
tive rating to each subject to indicate the im- 
portance of anxiety as a determinant of his 
behavior. Later all the subjects were considered 
jointly and ranked in terms of the degree or 
importance of anxiety. In addition to the diag- 
nostic council ratings, scores were available 
for the Psychosomatic Inventory which pre- 
sumably should bear some relation to anxiety. 
We obtained a rank-order correlation of .69 
between the total score of the inventory and 
the clinical ranking. 
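
For readers who want to see what a rank-order correlation of this kind involves, the sketch below computes Spearman's rho as the product-moment correlation of ranks, with tied values sharing their average rank. It is an illustration only: the function names are hypothetical, and the paper does not say how ties were handled.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank from low to high; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank-order correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)
```

Applied to the 20 subjects' inventory totals and clinical ranks, a function of this kind would yield the sort of coefficient reported above.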
Statistical Analysis 


In most of our analyses we relied upon rank-order 
correlation as a statistic suited to the 
relative crudeness of our data. In attempting 
to relate our “signs” to the subscores of the 
Psychosomatic Inventory, we used a technique 
described by Mosteller [4] for estimating correlation 
coefficients. This short-cut method was 
used as these relations were of relatively slight 
interest, and indicating the direction of the relation 
and the existence of strong relationships 
seemed sufficient. A perfect correlation would 
be signified by a score of 96 using this technique. 

Table 1 

Relation of TAT “Signs” of Anxiety to Criterion Measures 

                                                      Psychosomatic inventory
TAT variable                              Clinical    Total       Psycho-     Gastro-     Nervous     Worry‡      Fear‡
                                          ratings†    score†      somatic‡    intestinal‡ tension‡
 1. Intrapicture vacillation              .05         -.32        -38         -49         21          -28.5       -11
 2. Interpicture vacillation              -.38        -.04        -14         -19.5       -8.5        +5.5        -4.5
 3. Rejection and refusal                 .30         .40*        +33.5       +13.5       +26.5       +27.5       +3
 4. Outstanding emotional states          -.06        -.02        +10         1.5         -1          -37.5       -7
 5. Intense conflicts                     .03         .06         +8.5        +3.5        9.5         -30         -10.5
 6. Vagueness                             .40*        [illegible] +7.5        -49         10          +29.5       +18.5
 7. Direct identification                 .39*        .09         -1.5        -5          +7          -1.5        0
 8. Conflict unresolved                   .22         -.02        -5.5        -28         1           -29         -1.5
 9. Interjections                         .06         .21         +17.5       +9          25.5        +9.5        +11
10. Verb/adjective quotient               .09         -.13        -9          -31.5       -23         +2          -8
11. Adjectives per 100 words              .13         .09         +7.5        +33         11.5        -5          -1
12. Briefness                             .15         .31         +36         +51         +10         +26         +9
13. Object description                    .38*        .53*        +43         +2          +37.5       +32.5       +35
14. Feeling and expectation description   .22         .02         +19.5       +3          1.5         -9          -22.5
15. Sequence: memory to wish              -.16        -.09        -11         -22.5       -21.5       -29.5       -18.5
16. Blocking                              .33         .20         +20.5       +11         +5.5        +2.5        +1.5
17. Apprehensiveness                      .05         -.04        -8          -1          -2          -33         -11.5
18. Trauma                                -.19        .10         +7.5        +22         +23         +1.5        -4

* Significant at 5% level with one-tailed test. 
† Rank-order correlation. 
‡ Mosteller’s correlation approximation. 

Results and Discussion 


Our major results are summarized in Table 
1 where we find that, of the 18 TAT signs, 13 
show some trend toward positive association 
with the clinical rating of anxiety. Three of 
the positive correlations (vagueness and hesita- 
tion, self-identification, object description) are 
significant at the 5% level, and there is one 
negative correlation (interpicture vacillation 
from concept- to picture-dominated response) 
of approximately the same magnitude. These 
findings are not encouraging so far as the 
utility of the particular TAT measures employed 
here is concerned. Although there is 
some evidence of a tendency toward association 
between the TAT variables and our criterion 
measure, as witnessed by the preponderance of 
positive correlations, this seems a very tentative 
link, and the existence of relatively high nega- 
tive correlations suggests dramatically how far 
we are from an adequate formulation of how 
to measure anxiety with this instrument. 

The correlations between the TAT signs 
and the Psychosomatic Inventory are also summarized 
in Table 1. In general, they conform 
to the tendencies revealed in the clinical rating 
correlations. Especially for the total score and 
the psychosomatic subscore the direction of the 
correlations is quite similar to those we have 
already discussed, although the magnitude is 
generally less. In view of the presumed lesser 
sensitivity of this paper-and-pencil instrument, 
these findings are to be expected. 

Table 2 

Rho Intercorrelation of TAT “Signs” of Anxiety 

(Column numbers refer to the numbered TAT variables in the rows.) 

TAT variable                   1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16    17
 1. Intrapicture vacillation
 2. Interpicture vacillation  -.27*
 3. Rejections and refusals   -.28  -.10
 4. Emotional states           .25  -.03  -.18
 5. Intense conflicts          .25  -.25  -.13   .85
 6. Vagueness and hesitation   .14  -.52   .53  -.41  -.25
 7. Self-identification        .25  -.20   .52  -.19   .00   .33
 8. Unresolved conflicts       .60  -.20  -.03   .41   .53  -.04   .38
 9. Interjections             -.04  -.20   .75   .03   .21   .56   .19  -.01
10. Verb/adjective quotient   -.18   .21  -.40   .36   .25  -.30  -.36  -.05  -.27
11. Number of adjectives       .25  -.11   .37  -.36  -.33   .21   .42  -.04   .16  -.89
12. Briefness                 -.71   .26   .35  -.37  -.50   .02  -.12  -.33  -.02  -.02   .10
13. Object description        -.10   .26   .54  -.32  -.37   .18   .47   .02   .11  -.30   .32   .30
14. Feeling and expectation   -.26   .06   .21  -.11  -.15   .08   .12   .15   .03  -.24   .12   .41   .14
15. Memory followed by wish   -.03  -.06   .03   .39   .34  -.03  -.17   .16   .37   .08  -.28  -.09  -.09   .25
16. Blocking                   .07  -.48   .75  -.29   .09   .78   .32   .16   .59  -.27   .18   .18   .20   .10  -.10
17. Apprehensiveness           .19  -.05  -.18   .96   .83  -.31  -.22   .38   .13   .31  -.30  -.37  -.43  -.12   .36  -.07
18. Trauma                    -.02   .23  -.12   .63   .45  -.39  -.28   .00   .17   .12   .05  -.11  -.12  -.39   .17  -.17   .69

* A rho of .44 is required for significance at the 5% level. 

In Table 2 we present the intercorrelations 
between the TAT signs. If we exclude the 
first two variables, which appear to have been 
reversed in their scoring, the general tendency 
is for low positive correlations. In general this 
table of intercorrelations does not suggest a set 
of indices for the same or very similar variables. 
If some of these signs are in fact highly indicative 
of anxiety, then it seems evident that 
others are not very sensitive to this variable. 
Nevertheless, several interesting clusters ap- 
pear. In one we find the following variables: 
strong emotional states, intense conflicts, ap- 
prehensiveness, and trauma, while in the other 
we find: rejections, hesitations and vagueness, 
interjections, and blocking. The question of 
how much these represent related psychological 
variables and how much the association is arti- 
ficially produced by scoring the same story at- 
tributes under multiple headings is not clearly 
answered by our data. Certainly an examina- 
tion of the definitions of these eight variables 
suggests that some of these intercorrelations 
are the results of multiple scoring of the same 
aspects of the stories. 


Speculation concerning the relatively strong 
negative correlation between our measure of 
interpicture vacillation and our criterion mea- 
sure of anxiety suggested that a variable 
of rigidity or inflexibility might be mediating 
this relationship. A low picture-vacillation 
score implies that the subject adopts an 
initial set toward the story construction proc- 
ess and maintains this without variation 
throughout the series of pictures. The widely 
observed “freezing” effect of anxiety makes it 
understandable that the individual high in anxi- 
ety might be relatively invariant in his approach 
to storytelling. Given this initial hypothesis, we 
looked for other variables that could be seen 
as related in some way to rigidity, lack of 
spontaneity, perseverative tendencies, or inflex- 
ibility. We then set up new ranks based upon 
the pooled scores from these variables. First 
we combined scores from the interpicture vac- 
illation and intrapicture vacillation variables, 
as these measures seemed obviously to be get- 
ting at related material in spite of their low 
intercorrelation. For these two variables we 
reversed the direction of our original scoring 
because of the observed negative correlation 
and our post hoc rationale. The resulting cor- 
relation (.47) was somewhat better than the 
.38 correlation that our single measure had 
given us. Next we included scores for the 
tendency to simply describe objects or aspects 
of the stimulus picture. This combined rank 
showed a correlation of .56 with our criterion. 
Following this we developed a rank order 
based on these three variables plus scores for 
the tendency of the storyteller to specifically 
identify part of the story as coming from his 
own life or one of the characters as resembling 
himself. This last variable suggests a concrete- 
ness and lack of spontaneity that goes with 
the other variables we have already mentioned. 
The rank order based upon these variables 
correlated with our criterion measure .58. 
Finally, we introduced scores for vagueness or 
hesitation of the storyteller, reasoning that this 
indecisiveness was related to the lack of spon- 
taneity we have already commented upon. 
This ranking showed a .63 correlation with 
the criterion measure. 
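
The pooling procedure described in this paragraph can be illustrated roughly as follows. The sketch assumes that pooling means summing each subject's scores across the chosen variables after reversing the two vacillation scores, and then ranking the sums; the text does not spell out the arithmetic, so this is one plausible reading. The function names, the maximum of 19 for interpicture changes, and the four-subject data are all hypothetical.

```python
from typing import List, Sequence


def reverse_scores(scores: Sequence[float], max_score: float) -> List[float]:
    """Flip the direction of a variable so that low raw scores become high."""
    return [max_score - s for s in scores]


def pooled_scores(*variables: Sequence[float]) -> List[float]:
    """Add up each subject's scores across several variables."""
    if len({len(v) for v in variables}) != 1:
        raise ValueError("every variable needs one score per subject")
    return [sum(vals) for vals in zip(*variables)]


def rank_high_to_low(scores: Sequence[float]) -> List[float]:
    """Rank 1 goes to the highest pooled score; ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


# Hypothetical scores for four subjects (the study had 20).
intrapicture = [2, 10, 5, 7]        # 0-20 scale
interpicture = [3, 12, 6, 6]        # number of changes; 19 assumed as the maximum
object_description = [4, 0, 2, 1]   # 0-20 scale

pooled = pooled_scores(
    reverse_scores(intrapicture, 20),
    reverse_scores(interpicture, 19),
    object_description,
)
pooled_ranks = rank_high_to_low(pooled)
```

The resulting ranks would then be correlated with the criterion ranking in the same way as any single variable.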


The fact that through the use of three variables 
we were able to secure a correlation of .56 
with our criterion and that with the addition 
of two more variables we could raise this to 
.63 suggests that potentially the TAT may be 
a relatively sensitive indicator of anxiety. On 
the other hand, in evaluating these findings it 
is important to consider the small size of our 
sample and its composition. Our subjects all 
met some criteria of normality although at 
least four of them presented rather serious ad- 
justment problems. Even these four, when 
compared to severely disturbed patients, prob- 
ably could be characterized as only moderately 
anxious. Thus our results are based upon a re- 
stricted range of the variable of interest. 

The relative success of our formal measures 
in predicting anxiety when compared to the 
content measures suggests that analysts of the 
TAT might well devote more of their time 
to this type of variable. It is, of course, pos- 
sible that this finding may be rather specif- 
ically linked to the nature of the variable un- 
der study. 


In summary, our results are somewhat dis- 
couraging for the utility of common generali- 
zations concerning anxiety and the TAT. 
Nevertheless the fact that we were able to 
find relatively substantial correlations between 
our criterion measure of anxiety and small 
clusters of TAT variables that bore some rational 
relation to each other suggests the potential 
utility of the instrument in this area. 
If we take seriously the highly tentative evi- 
dence of this study, it seems that the TAT of 
the anxious person is characterized by an ex- 
cessive sameness or rigidity in the approach of 
the storyteller, a preference for limiting him- 
self to simple object description, a readiness 
to relate parts or characters in the stories explicitly 
to himself, a tendency to be vague or 
hesitant in presenting his stories, and also a 
readiness to reject his productions and refuse 
to tell or complete his stories. 


Received April 1, 1954. 
Brief Intellectual Assessment of Patients with 
Behavior Disorders 
Herbert W. Eber, Carl M. Cochrane 


University of North Carolina 


and Albert A. Branca 


VA Hospital, Fayetteville, North Carolina 


In treatment-oriented situations differentia- 
tion between essentially normal and very low 
intelligence is often sufficient. The Wechsler-Bellevue 
requires lengthy administration time 
disproportionate to the information required in 
these cases. The SRA Non-Verbal Form [1], 
a ten-minute test (plus one minute for scoring) 
with well-standardized norms and a reliability 
of .85, does not sample intelligent behavior 
broadly enough to confirm very low intelligence; 
in this study, it was evaluated for its 
ability to rule out very low intelligence. 

As the use of a normal population would 
have spuriously raised the validity of the 
technique, the SRA was here evaluated in the 
setting for which it is recommended. The sub- 
jects of the experiment were all the patients 
without brain damage, seen during a six-month 
interval in the NP service of this facility, in 
whose cases the ruling out of very low intelli- 
gence was a real problem. This group of 32 pa- 
tients was modal around the point of differen- 
tiation, and represented a real challenge to the 
technique. The mean Wechsler IQ of the group 
was 86.9 and the SD was 17.6. Both the 
Wechsler (Form I or II) and SRA were 
administered to each patient individually. 


For purposes of comparison the scores on 
both tests were converted into percentiles. 
These were classified according to the method 


1From the Veterans Administration Hospital, Fay- 
etteville, North Carolina. 


2An extended report of this study may be obtained 
without charge from Albert A. Branca, Chief, Clin- 
ical Psychology, VA Hospital, Fayetteville, North 
Carolina, or for a fee from the American Documen- 
tation Institute. To obtain it from the latter source, 
order Document No. 4344 from ADI Auxiliary Pub- 
lications Project, Photoduplication Service, Library 
of Congress, Washington 25, D.C., remitting in ad- 
vance $1.25 for microfilm or $1.25 for photocopies. 
Make checks payable to Chief, Photoduplication 
Service, Library of Congress. 


Wechsler used in his statistical definition of 
intellectual levels. His cutting point between 
“borderline” and “dull normal” was set at 
minus 2 PE, the 8.9th percentile. In this study, 
the class essentially normal was defined as at 
or above the 9th percentile; very low was de- 
fined as below this point. 
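
The correspondence between minus 2 PE and the 8.9th percentile can be checked numerically. The sketch below assumes a normal distribution and uses the standard conversion of one probable error to about 0.6745 standard deviations; the function names are hypothetical.

```python
from math import erf, sqrt

# One probable error expressed in standard-deviation units (normal distribution).
PROBABLE_ERROR_IN_SD = 0.6745


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def percentile_at_pe(pe_units: float) -> float:
    """Percentile rank of a score lying pe_units probable errors from the mean."""
    return 100.0 * normal_cdf(pe_units * PROBABLE_ERROR_IN_SD)


cutting_point = percentile_at_pe(-2)  # rounds to 8.9, the figure cited above
```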

The results are shown in the following 
contingency table: 

                   SRA Low    SRA Normal 
Wechsler normal       7           13 
Wechsler low         12            0 

Thus it was shown that any patient scoring 
at the 9th percentile or higher on the SRA 
scores above this mark on the Wechsler. The 
converse relationship does not hold. The 
correlation (product-moment of standard 
scores derived from percentiles) is .75; but the 
predictive value for our purpose is greater than 
the correlation shows, since an essentially 
normal score on the SRA apparently assures a 
like score on the Wechsler. Thus the SRA 
may be regarded as a one-sided instrument, capable 
of ruling out very low intelligence but not 
capable of confirming it. 
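
The one-sidedness of the instrument can be read directly from the contingency table. The short sketch below recomputes the two conditional proportions from the four reported counts; the dictionary layout and function name are hypothetical conveniences.

```python
from typing import Dict, Tuple

# Counts from the contingency table, keyed by (Wechsler class, SRA class).
counts: Dict[Tuple[str, str], int] = {
    ("normal", "low"): 7,
    ("normal", "normal"): 13,
    ("low", "low"): 12,
    ("low", "normal"): 0,
}


def proportion_wechsler(wechsler_class: str, sra_class: str,
                        table: Dict[Tuple[str, str], int]) -> float:
    """Share of patients in one SRA class who fall in the given Wechsler class."""
    column_total = sum(n for (_, sra), n in table.items() if sra == sra_class)
    if column_total == 0:
        raise ValueError("no patients in that SRA class")
    return table[(wechsler_class, sra_class)] / column_total


total_patients = sum(counts.values())
# All 13 patients scoring normal on the SRA were also normal on the Wechsler.
normal_confirmed = proportion_wechsler("normal", "normal", counts)
# Only 12 of the 19 patients scoring low on the SRA were low on the Wechsler.
low_confirmed = proportion_wechsler("low", "low", counts)
```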

Conclusion. On the basis of the 32 cases, it 
was concluded that the SRA Non-Verbal Form 
was a useful screening device for patients with 
behavior disorders when the only task is to 
select those high enough in intelligence to be 
considered essentially normal. Patients without 
brain damage who score at or above the 9th 
percentile on the SRA may be considered of 
normal intelligence, while those scoring below 
this point need further evaluation to obtain a 
broader based estimate of intellectual level. 
Brief Report 
Received June 25, 1954. 
Some Limitations in the Prediction of 
Infrequent Events 

Since the detection of suicidal patients is 
an important responsibility of clinical psychol- 
ogists, a number of investigators have proposed 
or evaluated certain signs, configurations, and 
items from psychological tests for their iden- 
tification [1, pp. 290, 326; 7; 8; 9; 12; 14; 
15; 18; 19; 21]. The purposes of this paper 
are (a) to suggest modifications in the classification 
of patients in suicide research, (b) to 
emphasize some of the limitations inherent in 
the prediction of suicide, and (c) to indicate 
the general applicability of such limitations in 
the prediction of any other behavior or event 
of infrequent occurrence. 


Classification of Patients in Suicide Research 


The term “suicidal” has been used to describe 
patients demonstrating heterogeneous kinds of 
behavior, such as (a) suicide thoughts, (b) 
suicide threats, (c) suicide attempts, and (d) 
severe depression. In previous research on sui- 
cide predictors, a common practice has been the 
combining of test data from such individuals 
with data from those who actually committed 
suicide. 


Differentiation of “Suicidal” Subgroups 


The need for more precise classification in 
suicide research has been indicated by Farbe- 
row [6] and by Rosen, Hales, and Simon 
[16], who demonstrated that “suicidal” 
subgroups can be differentiated by means of 
psychological tests. The two studies closely 
agreed in indicating that, as a group, patients 
who express suicide thoughts or suicide 
threats are much more severely disturbed than 
either patients who have made a suicide attempt 
or patients-in-general, and that patients 
in the latter two categories are similar to each 
other. 


1From the Neuropsychiatry Service, VA Hospital, 
Minneapolis, Minnesota, and the Divisions of Psychiatry 
and Clinical Psychology of the University 
of Minnesota. 


Suicide Rates 


Additional evidence of the need for greater 
refinement in the classification of patients in 
suicide research is provided by statistics on the 
incidence of suicide. Suicide is an extremely 
infrequent event, and very few of those patients 
who are severely depressed or who express sui- 
cide thoughts or threats actually commit sui- 
cide. Thus, data from such patients should not 
arbitrarily be grouped under the class term 
“suicidal.” Examination of the following 
data on suicide rates provides substantiation of 
these statements. 


Suicide rate for the general population. In the 
United States, the suicide rate for the general popu- 
lation is 11.4 in 100,000 [20, p. 75]. Reliable statistics 
for the frequency of suicide attempts in the general 
population are unavailable because of the difficulty 
of determining objectively what is an “attempt.” 
Data on the incidence of suicide among psychiatric 
patients are sparse; they may be classified in two 
ways, according to occurrence: (a) after termination
of treatment and (b) during hospitalization.

Posttreatment suicide rates among former psychi- 
atric patients. Only four published studies could be 
located which reported data on the incidence of sui- 
cide among patients after termination of treatment. 
The authors of these reports obtained their data 
primarily by correspondence. The first three deal 
with psychoneurotics followed up for periods of 
from 2 to 20 years. Coon and Raymond [2] found 
6 suicides among 1060 cases, a rate of .0057. Ross 
[17] reported 4 suicides among approximately 1186 
neurotics, a rate of .0034. Denker [3] discovered 3 
suicides among 707 cases, a rate of .0042. Denker’s 
cases had not been psychiatric patients, but had been 
diagnosed as neurotic. A fourth study, by Holt and
Holt [10], reported 2 suicides among a mixed diagnostic
group of 141 patients (a rate of .0142) within
a period of 30 years after hospitalization. A
third patient in this group committed suicide while
under treatment subsequent to his initial hospitalization.


Suicide rates for hospitalized psychiatric patients.
The rate of suicide among patients undergoing psychiatric
treatment is apparently lower, probably because
these patients are under close observation. Ross
[17, p. 98] reported 3 suicides among 1186 neurotic
patients, a rate of .0025. Two of these patients committed
suicide during the period of treatment, and
the other, immediately after termination. Levy and
Southcombe [13] found 21 suicides among 6509 admissions
(a rate of .0032) in a large state hospital
for the period from 1936 to 1949.²

For the sake of the present discussion, a rough es- 
timate of .0033 will be used to represent the suicide 
rate among psychiatric patients undergoing treatment.
Although the rate may be slightly higher in
some hospitals and clinics, it is evident that suicide 
occurs very infrequently among psychiatric patients. 
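
As a rough check on the figures cited above for patients under treatment, the following sketch recomputes each rate from the reported counts. The variable names and the pooling of the two studies are assumptions made here for illustration; the text states only that .0033 is a rough estimate and does not say how it was chosen.

```python
# Suicides and case counts for patients under treatment, as cited in the text.
in_treatment = {
    "Ross [17]": (3, 1186),
    "Levy and Southcombe [13]": (21, 6509),
}

# Rate for each study: suicides divided by number of cases.
rates = {study: s / n for study, (s, n) in in_treatment.items()}
for study, r in rates.items():
    print(f"{study}: {r:.4f}")

# Pooled rate across both studies (an illustrative assumption, not the
# author's stated procedure).
total_suicides = sum(s for s, _ in in_treatment.values())
total_cases = sum(n for _, n in in_treatment.values())
pooled = total_suicides / total_cases
print(f"pooled: {pooled:.4f}")
```

Both the individual and the pooled rates fall close to the working estimate of .0033 adopted for the discussion.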


Suggested Classification of Suicidal Patients 


The need for more precise classification in
research on suicide detection is thus empha- 
sized both by the data on the low incidence of 
suicide and by the evidence earlier cited that 
“suicidal” subgroups can be differentiated by 
means of psychological tests. In the develop- 
ment and validation of a suicide detection de- 
vice, one criterion group should consist exclu- 
sively of patients who committed suicide. 
Moreover, since a patient may undergo marked 
personality changes in the interval between test 
administration and the act of suicide, it is probably
advisable to use data obtained only a relatively
short time before the suicide. The most
desirable time interval cannot be determined 
until more information on the stability of test 
responses of suicidal patients is available. The
low incidence of suicide, of course, makes it 
extremely difficult to obtain presuicide data 
from an adequate number of cases for the de- 
velopment of a detection instrument. 


Limitations of Suicide Predictors 


A suicide detection instrument, to be effective,
must identify a fairly large proportion of
suicidal patients (true positives) and should
not misclassify a large number of nonsuicidal
patients (false positives). The low incidence
of suicide is in itself a major limitation in the
development of an effective suicide predictor,
for in any attempt at prediction of infrequent
behavior, a large number of false positives are
obtained. To illustrate this point, a hypothetical
suicide detection index will be devised.³


² Dublin and Bunzel [4, p. 311] reported a suicide
rate of 42.3 per 100,000 psychiatric patients in New
York State hospitals for the years 1919-1929. As
the source for this rate, they cited an article which
could not be located because the reference is incorrect.


Development of a Hypothetical Suicide Detection Index


Assume that a suicide detection index was 
developed from test data obtained from a psy- 
chiatric patient population divided into two 
groups: patients who committed suicide dur- 
ing treatment (Suicide population), and pa- 
tients who did not commit suicide during 
treatment (Nonsuicide population). Suppose 
that the steps in the development and valida- 
tion of the index were as follows: (a) A ran- 
dom sample of patients was selected from each 
of these populations; (b) an index was devised
which consisted of test data which significantly 
differentiated the two criterion samples; (c) 
all of the cases in the two samples were scored 
on the index; (d) a cutting line was estab- 
lished so that an equal percentage, say 80%, 
of the patients in each sample were correctly 
classified; (e) for the purpose of cross-validating
this cutting line with new Suicide and
Nonsuicide samples, every psychiatric pa- 
tient over a period of years was scored on the 
index, and these scores were not divulged so 
that they could have no influence on the treat- 
ment of the patients. 


Cross-validation of the index. On the basis
of the estimated suicide rate of .0033, there
would be approximately 40 suicides among
12,000 patients. Assume that, in cross-validating
the predetermined cutting line, only a
slight amount of shrinkage occurs so that
75% of the patients in each of the two populations
can be correctly identified.⁴ Table 1,
Part A, indicates the number of patients in
each population who are identified by the index.
In the Suicide population, only those patients
whose scores are above the stipulated
cutting line are correctly identified (true positives),
whereas in the Nonsuicide population,
only those whose scores are below this line are
accurately classified (true negatives). The effectiveness
of the index must be evaluated in
terms of the number of correct identifications,
in both populations, among all the patients predicted
to be suicides.


³ The writer is indebted to Dr. Paul E. Meehl for
indicating the importance of an inverse probability
approach to the problem of suicide detection.

⁴ The figure of 75% is used because it probably
represents the maximum effectiveness generally
achieved with cross-validated prediction and classification
devices. For example, Ellis and Conrad [5,
p. 406], in evaluating the effectiveness of the personality
inventories used for screening “maladjusted”
from “normal” servicemen during World War II,
found that, in general, roughly 75% of each group
were identified.


Table 1

Number of Patients in the Suicide and Nonsuicide Populations
Identified by a Hypothetical Suicide Detection Index
with Three Different Cutting Lines

                                       Actual behavior
                               Did commit      Did not commit     Total with
Behavior predicted             suicide         suicide            predicted
by index                       Number  (%)     Number   (%)       behavior

A. 75% True Positives; 75% True Negatives
Will commit suicide              30  ( 75%)     2990  ( 25%)         3020
Will not commit suicide          10  ( 25%)     8970  ( 75%)         8980
Total with actual behavior       40  (100%)    11960  (100%)        12000

B. 60% True Positives; 90% True Negatives
Will commit suicide              24  ( 60%)     1196  ( 10%)         1220
Will not commit suicide          16  ( 40%)    10764  ( 90%)        10780
Total with actual behavior       40  (100%)    11960  (100%)        12000

C. 2.5% True Positives; 99.5% True Negatives
Will commit suicide               1  (  2.5%)     60  (  .5%)          61
Will not commit suicide          39  ( 97.5%)  11900  ( 99.5%)      11939
Total with actual behavior       40  (100.0%)  11960  (100.0%)      12000


It can be seen that with the use of the index
30 of the 40 suicides could have been correctly
predicted. However, of the 11,960 patients in
the Nonsuicide population, 2990 would also
have been predicted suicides (false positives).
Since every patient scoring above the cutting
line would have been predicted to commit
suicide, 2990 out of 3020, or 99% of such patients,
are misclassified. Obviously, such a suicide
detection index would have no appreciable
value, for it would be impractical to treat as
suicidal the prodigious number of misclassified
cases.

Type of prediction made with the index.
In planning the initial development or cross-validation
of any psychological instrument, it
is essential to consider the kind of decision
which is to be made with the test results. In
the prediction of suicide, the test results must
be evaluated solely in terms of the number of
correct classifications among those patients predicted
to be suicides. In Table 1, Part A,
there were only 1% of such correct classifications.
Among all the patients in the Suicide
and Nonsuicide populations, however, 9000
out of 12,000, or 75%, were correctly classified.
It is not appropriate or meaningful to evaluate
the index in terms of the latter percentage, for
there is no need to identify the patients in the
Nonsuicide population. In fact, without any
test or other information, all patients could be
predicted to be nonsuicidal, and the prediction
would be correct in 99.67% of all cases.

Cross-validation of an elevated cutting line.
For a suicide detection index to have any conceivable
value, the number of false positives
must be drastically reduced. This could have
been effected in the original development of the
index (step d) if a much higher cutting line
had been established. Assume, for example,
that when this elevated cutting line was cross-validated
(step e), only 10% of the nonsuicide
cases scored above the line (i.e., false
positives). Assume also that the new cutting
line reduced to 60% (still a liberal estimate)
the number of correctly classified suicide patients.
Table 1, Part B, illustrates the results
of applying the index to the Suicide and Non- 
suicide populations. Of the 40 patients who 
did commit suicide, 24 could have been iden- 
tified by the index. There would still be 1196 
false positives, however, so that if every pa- 
tient who scored above the cutting line were 
called suicidal, the number of correct classi- 
fications would be only 24 out of 1220 or 2%. 
The index, therefore, would still be an im- 
practical instrument because of the large num- 
ber of false positives. 
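
The contrast drawn above between correct classifications among predicted suicides and correct classifications among all patients can be checked with a short computation. The sketch below is illustrative only: the function name `confusion_counts` and its parameters are assumptions introduced here, and the inputs are the hypothetical Part A figures (12,000 patients, 40 suicides, 75% correctly classified in each population).

```python
def confusion_counts(n_patients, n_suicides, hit_rate, correct_rejection_rate):
    """Return (true_pos, false_neg, false_pos, true_neg) as whole patients."""
    n_nonsuicides = n_patients - n_suicides
    true_pos = round(n_suicides * hit_rate)
    true_neg = round(n_nonsuicides * correct_rejection_rate)
    return true_pos, n_suicides - true_pos, n_nonsuicides - true_neg, true_neg

N_PATIENTS, N_SUICIDES = 12_000, 40
tp, fn, fp, tn = confusion_counts(N_PATIENTS, N_SUICIDES, 0.75, 0.75)

# Correct classifications among those predicted to be suicides.
correct_among_predicted = tp / (tp + fp)
# Correct classifications among all patients.
overall_accuracy = (tp + tn) / N_PATIENTS
# Accuracy of calling every patient nonsuicidal, with no test at all.
all_nonsuicide_accuracy = (N_PATIENTS - N_SUICIDES) / N_PATIENTS

print(f"predicted suicides: {tp + fp}, of whom correct: {correct_among_predicted:.1%}")
print(f"overall accuracy: {overall_accuracy:.2%}")
print(f"accuracy of predicting no suicides: {all_nonsuicide_accuracy:.2%}")
```

The overall figure of 75% looks respectable only until it is set beside the 99.67% obtained with no test at all, which is why the index has to be judged on the predicted-suicide group alone.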


Further reduction of the number of false 
positives. With every elevation of the cutting 
line for the purpose of reducing the number 
of false positives, there will be a decrease in 
the number of correct predictions for the Sui- 
cide population (true positives). If the origi- 
nal cutting line were raised so that in the cross- 
validation procedure almost all false positives 
were eliminated, it might still be possible to 
detect one or two cases in the Suicide popula- 
tion by means of the index. Table 1, Part C, 
illustrates such a hypothetical situation. Al- 
though only 1 in 61, or less than 2% of the 
predicted suicides, is correctly identified, the 
important consideration is that the number of 
false positives has been reduced to a relatively 
manageable figure. The writer can think of 
no other event of relevance to clinical psy- 
chologists for which one might settle for such 
a low level of psychometric prediction. The 
attitude of hospital administrators, however, 
is that suicides must be prevented at almost 
any cost. Therefore, if the items comprising 
the index could be administered to all patients 
without undue expenditure of clinical time, the 
results in this apparently unique type of prob- 
lem might be considered of some use. 
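
The trade-off across the three cutting lines of Table 1 can be laid out side by side. This is a minimal sketch under the table's own hypothetical percentages; the variable names are assumptions made here, and counts are rounded to whole patients as in the table.

```python
N_SUICIDE, N_NONSUICIDE = 40, 11_960

# (part, share of suicides scoring above the line,
#  share of nonsuicides scoring below the line)
cutting_lines = [("A", 0.75, 0.75), ("B", 0.60, 0.90), ("C", 0.025, 0.995)]

results = []
for part, true_pos_rate, true_neg_rate in cutting_lines:
    true_pos = round(N_SUICIDE * true_pos_rate)
    false_pos = N_NONSUICIDE - round(N_NONSUICIDE * true_neg_rate)
    share_correct = true_pos / (true_pos + false_pos)
    results.append((part, true_pos, false_pos, share_correct))
    print(f"Part {part}: {true_pos} detected, {false_pos} false positives, "
          f"{share_correct:.1%} of predicted suicides correct")
```

Raising the line cuts the false positives from 2990 to 60, but the number of suicides detected falls from 30 to 1, and the share of predicted suicides who are correctly identified never exceeds about 2%.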


Prediction within a Population Restricted 
to Certain Diagnostic Groups 


It can be seen that it is very difficult to pre- 
dict effectively any event with an extremely 
low rate of occurrence because of the large 
number of false positives. Another approach 
to the problem of reducing the number of false 
positives might be the development of an in- 
dex within a population consisting only of those 
diagnostic groups with the highest rates of suicide.
The advantage of this procedure, of
course, is that the rate of suicide within this
restricted population would be higher than that 
in the entire psychiatric patient population. 
Two immediate obstacles to the use of this ap- 
proach are: (a) There is virtually no published 
information on suicide rates for given diagnostic 
categories. The writer could locate only one re- 
port [4, p. 311] which provided adequate data 
of this kind. (b) As a smaller number of suicides
occur within a restricted population than
in an entire psychiatric patient population, many 
years would have to elapse before a sufficient 
number of cases with presuicide data could be 
collected. It would be necessary, therefore, to 
conduct a coordinated research program in 
many installations. 


Additional limitations of the restricted pop- 
ulation approach. It would eventually be pos- 
sible to obtain information on suicide rates 
within diagnostic groups, and it is also con- 
ceivable that a sufficient number of cases with 
presuicide data could be collected. Even if 
these two difficulties were surmounted, how- 
ever, the procedure of developing and apply- 
ing an index within a population restricted to 
one or several diagnostic groups has several 
limitations: (a) The rate of suicide would 
still be low; probably no group could be found 
with a suicide rate higher than .02, and there 
would still be an excessive number of false 
positives. (b) A considerable proportion of suicidal
patients would not be included in the
restricted population because of (i) the limited
reliability of diagnosis, and (ii) the occurrence
of suicide in diagnostic categories not included
in the restricted population [4, p. 311; 13].
(c) Variables differentiating suicide and non- 
suicide patients would be more difficult to ob- 
tain within a somewhat homogeneous subgroup 
than within the whole psychiatric population. 


Prediction Based on Clinical Judgment 


The same kinds of difficulties which occur 
in psychometric prediction of suicide are en- 
countered in prediction based on “clinical 
judgment.” In clinical practice a large number 
of false positives are also obtained. The clini- 
cian, in referring to a patient as “suicidal,” is 
willing to err on the safe side and treat a great 
many patients as though they were very likely 
to commit suicide. 







Necessary Basic Research 


It has been shown that the effectiveness of 
a suicide detection instrument is a function of 
(a) the rate of suicide within a given patient 
population (complete or restricted), and (b)
the percentage of correct classifications both in 
the suicide and in the nonsuicide subgroups of 
the given population. Other things being equal, 
a given index will be most effective in the 
population with the highest suicide rate. Also, 
within a population with a given suicide rate, 
that index will be most effective which correctly 
classifies the highest percentage of cases in each 
of the population subgroups. 
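
One way to make this dependence explicit, in the spirit of the inverse probability approach mentioned earlier, is the expression below. The symbols are introduced here for illustration and do not appear in the original: $p$ is the suicide rate in the population, $s_{+}$ the proportion of the suicide subgroup correctly classified, and $s_{-}$ the proportion of the nonsuicide subgroup correctly classified.

```latex
\[
P(\text{suicide} \mid \text{predicted suicide})
  = \frac{p\, s_{+}}{p\, s_{+} + (1 - p)(1 - s_{-})}
\]
% Check against Part A of Table 1: p = 40/12000, s_+ = s_- = .75
\[
\frac{(40/12000)(.75)}{(40/12000)(.75) + (11960/12000)(.25)}
  = \frac{30}{30 + 2990} \approx .0099
\]
```

With $s_{+}$ and $s_{-}$ held fixed the expression rises with $p$, and with $p$ held fixed it rises with either proportion, which is the sense of the two statements above.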

An essential prerequisite to the development 
of an effective detection device is the acquisi- 
tion of a large body of information concerning 
the distinguishing characteristics of patients 
who commit suicide. To obtain such informa- 
tion, patients must be studied intensively by 
means of comprehensive social histories, be- 
havior ratings, and psychological tests. The 
differentiating data thus collected may be uti- 
lized in either or both of the following ways: 
(a) to exclude from consideration as potential 
suicides a large proportion of psychiatric pa- 
tients in order to obtain a restricted population 
with a higher suicide rate; (b) to find a large
number of discriminating predictor variables 
which will correctly classify a high percentage 
of each of the suicide and nonsuicide sub- 
groups. 


Limitations in the Prediction of Other 
Infrequent Events 


Clinical psychologists deal with virtually no 
other events as infrequent as suicide. One ex- 
ception might be the prediction of homicide, 
for the incidence of homicide in the general 
population is about half that of suicide [20, p. 
75]. The limitations of prediction discussed in 
this paper, however, apply in lesser degree to 
any other behavior as its rate approaches .50 
in a population dichotomized according to the 
presence or absence of the behavior.⁵


⁵ Elaboration of this point and other methodological
problems will be included in a paper by Dr. Paul
E. Meehl and the writer which is to be published.


The problem of suicide detection provides
an effective illustration of the importance of
considering the rate of occurrence within a given
population of any event which is to be predicted.
Ellis and Conrad [5] emphasized this
point in their critical review of personality inventories
used for screening purposes in the
military services. They indicated that a large
number of false positives were unavoidable because
of the relatively small proportion of
“maladjusted” individuals. They listed some
studies which were misleading because the
false positives were reported only in terms of
percentages rather than the actual number of
cases. Moreover, they reported some investigations
in which the normal and abnormal
samples were equal in number, and which
were evaluated as though the proportions in
the population were also equal, i.e., .50 rather
than roughly .95 and .05. Ellis and Conrad
indicated that such studies provide a spurious
underestimate of the number of false positives
in the population. Hunt [11, p. 214], in reviewing
the Ellis and Conrad article, further
elaborated on these points because they are
generally overlooked.
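
The underestimate described by Ellis and Conrad can be illustrated with a small computation. The sketch assumes, purely for illustration, a screening device that correctly identifies 75% of each group (the rough figure quoted earlier from their review); the function name and the sample sizes are hypothetical.

```python
def false_share_among_flagged(n_maladjusted, n_normal,
                              hit_rate=0.75, correct_rejection_rate=0.75):
    """Share of all flagged cases that are actually normal (false positives)."""
    true_pos = n_maladjusted * hit_rate
    false_pos = n_normal * (1 - correct_rejection_rate)
    return false_pos / (true_pos + false_pos)

# Validation study with equal-sized normal and abnormal samples.
balanced = false_share_among_flagged(100, 100)
# Same device applied where the proportions are roughly .05 and .95.
population = false_share_among_flagged(100, 1900)

print(f"equal samples:      {balanced:.0%} of flagged cases are false positives")
print(f".05/.95 population: {population:.0%} of flagged cases are false positives")
```

The device is unchanged between the two lines of output; only the proportions differ, and the share of flagged cases that are false positives rises from one quarter to well over four fifths.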

In the clinical psychology literature, popu- 
lation rates of the events under study are rare- 
ly reported or considered. However, the effec- 
tiveness of any method for the prediction of 
the behavior of individuals cannot be evalu- 
ated properly without at least a rough esti- 
mate of the frequency of the behavior in the 
population being studied. 
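
To show how strongly the evaluation depends on that rough estimate of frequency, the sketch below holds a hypothetical predictor fixed at 75% correct in each subgroup and varies only the rate of the behavior. The function name and the particular rates chosen are assumptions made for illustration.

```python
def share_correct_among_predicted(base_rate, hit_rate=0.75,
                                  correct_rejection_rate=0.75):
    """Proportion of positive predictions that are correct at a given rate."""
    true_pos = base_rate * hit_rate
    false_pos = (1 - base_rate) * (1 - correct_rejection_rate)
    return true_pos / (true_pos + false_pos)

for base_rate in (0.0033, 0.02, 0.05, 0.20, 0.50):
    share = share_correct_among_predicted(base_rate)
    print(f"rate of behavior {base_rate:.4f}: "
          f"{share:.1%} of positive predictions correct")
```

Only as the rate approaches .50 does the proportion of correct positive predictions approach the 75% that the predictor achieves within each subgroup.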


Summary 

A number of investigators have proposed 
certain signs and configurations from psycho- 
logical tests for the detection of suicidal pa- 
tients. The purposes of this paper have been 
(a) to suggest greater refinement of the classification
of patients in suicide research, (b) to
emphasize some of the limitations inherent in 
the prediction of suicide, and (c) to indicate 
the general applicability of such limitations in 
the prediction of any other behavior or event 
of infrequent occurrence. 


Classification of patients in suicide research. 
The need for more precise classification of pa- 
tients in suicide research was emphasized both 
by (a) evidence that “suicidal” subgroups can 
be differentiated by means of psychological 
tests, and (b) data on the low incidence of
suicide. Suicide is an extremely infrequent 
event, and very few of those patients who are 
severely depressed or who express suicide
thoughts or threats actually commit suicide.
Thus, data from such patients cannot arbi- 
trarily be grouped under the class term 
“suicidal.” It was suggested that in the develop- 
ment and validation of a suicide detection 
device, one criterion group should consist exclu- 
sively of patients who committed suicide and 
for whom test data were obtained a relatively 
short time before the suicide. 


Limitations of suicide predictors. The low 
incidence of suicide is in itself a major limita- 
tion in the development of an effective suicide 
detection device, for in the attempt to predict 
suicide or any other infrequent event, a large 
number of false positives are obtained (patients 
incorrectly classified as suicides). To illustrate 
this point, a hypothetical suicide detection in- 
dex was developed and cross-validated within 
a psychiatric patient population. It was dem- 
onstrated that such an index would have no 
practical value, for it would be impossible to 
treat as potential suicides the prodigious num- 
ber of false positives. With elevations of the 
cutting line of the index so as to reduce the 
number of false positives to a practical level, 
it was estimated that the number of correctly 
identified suicidal patients (true positives) 
would also be drastically reduced, perhaps to 
zero. 


Another approach to the problem of re- 
ducing the number of false positives is the de- 
velopment of an index within a population 
consisting only of those diagnostic groups with 
the highest rates of suicide. The major limi- 
tations in the procedure of studying a popula- 
tion restricted to one or a few diagnostic 
groups are as follows: (a) The suicide rate 
would still be so low as to produce an excessive
number of false positives. (b) A considerable
proportion of suicidal patients might not
be included within the population because of 
both the limited reliability of diagnosis and the 
occurrence of suicide in many other diagnostic 
groups. (c) It would be more difficult to ob- 
tain variables differentiating between suicidal 
and nonsuicidal patients within one or a few 
diagnostic groups than within the whole psy- 
chiatric population. 


The difficulties in predicting events of low 
incidence are not restricted to psychometric 
prediction, but are likewise encountered in prediction
based on “clinical judgment.” A suicide
detection device is not feasible until much
more is learned about the differential charac- 
teristics of patients who commit suicide. 


Limitations in prediction of other infre- 
quent events. The problem of suicide detec- 
tion provides an effective illustration of the 
importance of considering the frequency with- 
in a given population of any behavior or event 
which is to be predicted. The limitations of 
prediction discussed in this paper apply in les- 
ser degree to any behavior as its rate approach- 
es .50 in a population dichotomized according 
to the presence or absence of the behavior. 
The effectiveness of any method for the pre- 
diction of behavior for individuals cannot be 
evaluated properly without at least a rough es- 
timate of the frequency of the behavior in the 
population being studied. 


Received March 30, 1954.


Intelligence Factors in Irregular Discharge
Among Tuberculosis Patients¹,²


John R. Thurston and George Calden 


VA Hospital, Madison, Wisconsin 


Irregular discharge from tuberculosis hos- 
pitals is a problem of major medical and social 
importance. For each patient who remains until
maximum hospital benefit has been achieved, 
there is another patient who leaves the hospital 
prematurely, without medical approval, i.e., 
receives an irregular discharge [1]. This sug- 
gests that the current tuberculosis therapy pro- 
gram can be applied effectively to only half of 
the hospitalized patients. From a public health 
point of view, the seriousness of this problem 
cannot be overemphasized. 

It has often been hypothesized that low in- 
telligence may be characteristic of the patient 
who leaves irregularly. This study was design- 
ed to answer the following question: Are pa- 
tients who leave the hospital irregularly less in- 
telligent than those who remain to receive a 
medically sanctioned discharge? 

The Ss in this study were 182 male tuberculosis
patients admitted to this hospital during a
one-year period. All Ss were given a short-form
Wechsler-Bellevue during their second month
of hospitalization. The abbreviated form consists
of six subtests: Information, Comprehension,
Digit Span, Similarities, Block Design,
and Picture Completion.


¹ From the Veterans Administration Hospital,
Madison, Wisconsin.

² An extended report of this study may be obtained
without charge from John R. Thurston, Psychosomatic
Section, VA Hospital, Madison, Wisconsin,
or for a fee from the American Documentation Institute.
To obtain it from the latter source, order
Document No. 4340 from ADI Auxiliary Publications
Project, Photoduplication Service, Library of
Congress, Washington 25, D. C., remitting in advance
$1.25 for microfilm or $1.25 for photocopies.
Make checks payable to Chief, Photoduplication
Service, Library of Congress.


The irregular discharge group consisted of 
the first 50 Ss in the sample to leave the hospital
irregularly. The regular discharge group
consisted of the first 50 Ss in the sample who
obtained medically approved discharges.

Analysis of the W-B test results indicates 
that the tuberculosis patient group, as a whole, 
is of average intelligence (mean FIQ = 
107.59). 

A comparison of the mean IQ scores and
subtest scores of the 50 regularly discharged patients
(VIQ = 107.72, PIQ = 112.48, FIQ = 110.62)
and 50 irregularly discharged patients
(VIQ = 105.20, PIQ = 108.14, FIQ = 107.20)
reveals no statistically significant
differences. These results suggest that on a
group basis, there is little difference in intelli- 
gence between patients receiving regular and 
irregular discharges. 

In order to comprehend the relationship of 
intelligence to irregular discharge, it is prob- 
ably necessary to consider the interaction of 
individual intelligence, personality, and situa- 
tional factors, rather than to limit the evalua- 
tion to IQ scores per se. 


Brief Report 
Received July 19, 1954.


Rorschach Scores, Prognosis, and Course of Illness
in Pulmonary Tuberculosis¹


David Cohen 


VA Hospital, Coatesville, Pa. 


In recent years, more and more attention 
has been given to the study of emotional factors 
as causative agents in and as reactions to the 
acute and chronic illnesses. Pulmonary tubercu- 
losis, because of its chronicity and high inci- 
dence in the general population, has been the 
subject of many of these investigations with 
many further methodologically improved stud- 
ies still in order. To date, various studies have 
been conducted and inferences drawn concern- 
ing possible personality factors predisposing to 
and maintaining the disease; the role of environmental
stress in precipitating the disease;
the psychological concomitants of the illness;
the relationship between personality traits and such
variables as illness severity, course of the disease,
the patient’s acceptance of hospitalization,
and the arrest and relapsing qualities of
the disease; the selection of those patients most
likely to leave the hospital against medical ad- 
vice; and social and psychological factors in 
the rehabilitation of the tuberculous patient. 
Barker, Wright, and Gonick [1] make an ex- 
cellent critical review of this literature. Other 
valuable critical reviews in this general area 
are offered by Bell, Trosman, and Ross [2, 3], 
and Windle [10]. 


Problem 


The present study is an attempt to measure
the ability of the Rorschach technique, singly,
to predict two years beforehand the medical
progress rate of the pulmonary tuberculosis
disease process. Rate of medical progress is defined
as the degree to which a patient’s medical
progress conforms to his expected progress based
on the patient’s over-all medical history. The
medical progress ratings, as noted below, were
independently made by three medical experts
two years after the Rorschach protocols were
obtained.


¹ From the Veterans Administration Hospital,
Coatesville, Pennsylvania. This study was effected
at the Veterans Administration Hospital, Butler,
Pennsylvania.


Procedure 


The Rorschach protocols used in the present 
study were obtained from 45 male, white vet- 
erans hospitalized and being treated for active 
pulmonary tuberculosis at the Veterans Ad- 
ministration Hospital, Butler, Pennsylvania. 
The Rorschach data were originally obtained 
and analyzed by Newton [9] in his compar- 
ative study of the personality traits of patients 
with far advanced pulmonary tuberculosis as 
differentiated from those with minimally and 
moderately advanced tuberculosis. Twenty- 
four months after the original test administra- 
tion, three physicians from the Tuberculosis 
Service of the hospital, upon the writer’s re- 
quest, rated the degree to which the course of 
the pulmonary tuberculosis process in each sub- 
ject compared with the expected course of the 
disease in each case as determined by the entire 
medical history as shown in the medical chart. 
The rating scale instructions were specifically 
as follows: 


“Please rate on the attached scale the degree to
which the course of the pulmonary tuberculosis process
coincides with that which you would expect that
disease process to have followed on the basis of the
entire medical history as shown in the medical
chart.” The graphic rating scale consisted of five
serially numbered points defined as follows: “Medical
course much worse than expected”; “Medical
course somewhat worse than expected”; “Medical
course as expected”; “Medical course somewhat better
than expected”; “Medical course much better than
expected.”

In each case, a complete medical chart, as
well as all X-ray plates, was available for analysis.
All ratings by the three physicians were
effected in a two-week interval so that an approximately
equal period of time elapsed between
the original Rorschach testing and the
medical rating procedures. The three raters
were highly competent medical practitioners
specializing for many years in the diagnosis and
treatment of tuberculosis. The three raters met
together, reviewed the medical charts, X-ray 
plates, and other pertinent medical data, and 
each one then independently made his judg- 
ment on the above rating scale. Only those 
cases in which there was complete agreement 
between the three judges were included in this 
study. In each case, all ratings were made 
solely on the basis of the same data. None of 
the judges had close ward contact or immediate 
medical responsibility for the particular sub- 
jects of the study. This control was utilized 
in order to minimize personality bias on the 
part of the medical raters and to reduce the 
likelihood of their confounding adjustment to 
hospital routine with progress of the disease. 
The operation of this “halo” phenomenon and 
cautioning against it were pointed out by Dan- 
iels and Davidoff [4]. 

Analysis of the Rorschach protocols was 
done independently of the medical ratings so 
that there was no contamination between med- 
ical ratings and test interpretation. 

The Ss of the present study had originally
been selected in Newton’s study in 24 pairs,
each pair matched for age, number of months
of hospitalization for the disease, number of years
of education, Vocabulary score of the Wechsler-Bellevue
Intelligence Scale, marital status, vocation,
severity of the disease process, and
whether the patient was improving, declining,
or showing no significant change in his physical
condition. No Ss who showed any psychiatric
pathology at the time of the study or in whose
history there was evidence of marked emotional
maladjustment were included.


Table 1

Means, Standard Deviations, and Variance Ratios
between Progress Groups on Age, Months of
Hospitalization, and Vocabulary Score of Wechsler-Bellevue
Intelligence Scale

                              Progress       Progress    Progress
Variable                      worse than     as          better than     F
                              expected       expected    expected

N                               19             15          11
Age                   Mean      29.84          26.87       25.64        2.42*
                      SD         7.34           6.41        3.40
Months of             Mean      15.21          11.97       11.50         .88*
hospitalization       SD         9.45           6.66        8.40
Wechsler-Bellevue     Mean      22.68          23.53       23.64        1.64*
Vocabulary score      SD         5.18           5.17        4.25

* Not significant at the .05 level of confidence.


One member of each pair was diagnosed as 
either a minimally or moderately advanced case 
of active pulmonary tuberculosis while the 
other member of the pair was diagnosed as a 
far advanced case of active pulmonary tubercu- 
losis. The criteria of the National Tuberculosis 
Association [8] were routinely used by the 
medical staff in making this classification. 


Results 


Table 1 presents the means, standard de- 
viations, and variance ratios between the pro- 
gress rating groups on age, Wechsler-Bellevue 
Vocabulary scores, and months of hospitaliza- 
tion. Degree of medical progress is not signifi- 
cantly related to any of these variables. 


Since each S had originally been classified as
to the severity of the tuberculous process, a
contingency chi-square analysis was carried out to
determine the relationship between progress
ratings and severity of illness. In a sense, such
an analysis is a check upon the validity of the
progress ratings, since these ratings should not
vary significantly with the severity of illness:
the instructions to the raters were specifically
to make their ratings in terms of everything
known about each S, including the variable
of illness severity.


A chi square of 9.52, significant between the .01
and .001 levels of confidence, was obtained,
indicating that the three raters were being
significantly influenced in their progress ratings by
the variable of illness severity. Further chi-square
analysis between each of the progress
rating groups indicated that the medical raters
were assigning to the far advanced tuberculosis
group a significantly greater number of “better
than expected” progress ratings than “as expected”
progress ratings. However, when the two
“better than expected” groups were combined with
the “as expected” group, and this
combined group was compared with the “worse than
expected” progress groups, the variable of illness
severity was found not to be significant in
the judges’ progress ratings. With this analysis,
the obtained chi square between progress ratings
and the severity of illness variable was
found to be 1.08, significant only at the .30
level of confidence.
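
The two significance levels quoted above can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: the degrees of freedom are not reported in the text, so it assumes 2 for the first analysis (three progress groups by two severity classes) and 1 for the collapsed two-by-two analysis. The helper name `chi2_upper_tail` is hypothetical, and only the closed-form cases for 1 and 2 degrees of freedom are handled.

```python
import math


def chi2_upper_tail(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square value (closed forms for df 1 and 2)."""
    if x < 0:
        raise ValueError("chi-square values cannot be negative")
    if df == 1:
        # For 1 df the statistic is the square of a standard normal deviate.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # For 2 df the distribution is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 or df = 2")


# Assumed df = 2 for the three-group analysis (not stated in the text).
p_three_groups = chi2_upper_tail(9.52, df=2)
# Assumed df = 1 for the collapsed two-group analysis (not stated in the text).
p_two_groups = chi2_upper_tail(1.08, df=1)

print(f"chi square 9.52, df 2: p = {p_three_groups:.4f}")
print(f"chi square 1.08, df 1: p = {p_two_groups:.2f}")
```

Under these assumed degrees of freedom, the first probability falls between .001 and .01 and the second is close to .30, which agrees with the levels reported.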

These two derived progress groups were 
compared in terms of the Harrower-Erickson 
technique of evaluating neurotic signs in the 
Rorschach test as well as in terms of 35 Ror- 
schach scoring variables. 


Harrower-Erickson [6], by isolating and 
counting nine neurotic signs in Rorschach protocols,
was able to identify 80% of the records
of neurotic patients from the control subjects 
used in her study. Newton [9], using these 
signs, demonstrated a greater number of neu- 
rotic personality traits in the far advanced pul- 
monary tuberculous group than in the less ad- 
vanced tuberculous group. 


In the present study, the group rated as do- 
ing “as well as or better than expected” 
achieved a mean number of neurotic signs of 
3.19; standard deviation 1.90. The group rated 
as doing “worse than expected” achieved a mean 
number of 3.68 of such neurotic signs; stand- 
ard deviation 2.24. Analysis of variance reveals 
no significant difference between the groups in 
terms of the Harrower-Erickson neurotic signs. 


The 35 Rorschach scoring variables studied 
were: W, D, d, Dd, S, R, M, FM, m, k,
K, FK, F%, F+%, Fc, c, C’, FC, CF, C,
P, R+%, percentage of responses occurring on
the last three cards, average reaction time for 
chromatic, achromatic, and all ten cards, H%, 
Hd%, A%, Ad%, At%, number of content 
categories other than the preceding four, num- 
ber of card rejections, and the Anxiety and 
Hostility scores of Elizur’s method of analyz- 
ing Rorschach content [5]. The last two scor- 
ing variables are based on Elizur’s Rorschach 
Content Test procedure and have been shown 
to be related to the subject’s covert hostility 
and covert anxiety, which are manifested in 
perceptual and fantasy living rather than be- 
ing acted out. 


The H test of Kruskal and Wallis [7]
was used to measure the significance of differences
between the two progress groups. The
Kruskal and Wallis method is a ranking technique
which assumes only that all observations
are independent; that all observations within a
given sample are derived from a single population;
and that the sample populations are of
approximately the same form. It makes no
assumptions about scale units or the normality
of score distributions. Conventional additive
statistical techniques would require both, and
the Rorschach data meet neither. A derived H
value is treated as a chi-square value with the
degrees of freedom equal to the number of samples
being compared less one. There is a correction for ties.
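
The computation just described can be sketched as follows. This is a minimal illustration of the standard H statistic with its tie correction, not a reconstruction of the authors' own worksheets; the function names and the small sets of scores are hypothetical and stand in for any one Rorschach scoring variable.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*samples):
    """Return (tie-corrected H, degrees of freedom) for two or more samples."""
    pooled = [x for sample in samples for x in sample]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted_sum = 0.0
    start = 0
    for sample in samples:
        rank_sum = sum(ranks[start:start + len(sample)])
        weighted_sum += rank_sum ** 2 / len(sample)
        start += len(sample)
    h = 12.0 / (n_total * (n_total + 1)) * weighted_sum - 3.0 * (n_total + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(samples) - 1


# Hypothetical scores for two groups, with one three-way tie.
h_value, df = kruskal_wallis_h([1, 2, 2], [2, 3, 4])
print(round(h_value, 4), df)
```

With two groups, as in the present comparison, the resulting H is referred to the chi-square distribution with one degree of freedom.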


On none of the Rorschach scoring variables 
did the progress groups differ significantly. Ap- 
parently with patients of relatively young age, 
as in the population of this study, who have 
been hospitalized for active pulmonary tuber- 
culosis for a relatively short period early in their 
illness, the Rorschach technique cannot reliably 
predict the rate of recovery, at least over a
twenty-four-month interval.


Summary 

This study measured the ability of the Ror- 
schach technique, singly, to predict two years 
beforehand the medical progress rate of young, 
male, hospitalized, pulmonary tuberculous pa- 
tients being treated early in their illness. Rate 
of medical progress was defined as the degree 
to which a patient’s medical progress conforms 
to his expected progress based on the patient’s 
over-all medical history. None of 33 Rorschach
scoring variables, nor the Harrower-Erickson
technique of evaluating neurotic signs on this
test, nor the anxiety and hostility variables of
Elizur’s Rorschach Content Test, was found
to predict the rate of medical progress reliably.
Degree of medical progress was not significant- 
ly related to age, length of hospitalization, in- 
telligence, or severity of the tuberculous pro- 
cess. 


Received May 3, 1954.
The recent extravagant use of psychotherapy
as a cure-all in the area of emotional problems 
has rivaled the indiscriminate use of penicillin 
in its early days. One can push the analogy a 
step farther: while both have their legitimate 
areas of application, their usefulness has been 
clouded by the initial overselling of their po- 
tentialities. 

The problem of psychotherapy today is not
whether it works but rather in what particular
instances it works. All of the various schools
of psychotherapy have made lists of character- 
istics which distinguish a good therapy prospect 
from a poor one [1, 7, 11]. These lists, for the 
most part, have been developed from long 
clinical experience, but there is little objective 
evidence available. 

Problem. An attempt was made to differen- 
tiate the most successful from the least success- 
ful therapy cases on the basis of pretherapy test 
indicators. The writer [9] has previously made 
a comparison of clients who drop out of thera- 
py contact after one or two interviews with a 
group that continued contact. He found, as 
predicted, that overt stress seemed to be a de- 
termining factor in keeping clients in thera- 
peutic contact. He also found that a lack of 
productivity on tests was characteristic of the 
group who discontinued therapy. The writer 
interpreted this lack of productivity as defen- 
siveness on the part of the client representing 
an unwillingness or inability to fully enter in- 
to the total therapeutic situation. 

The hypotheses in the present study are an 
extension of those tested in the previously men- 
tioned research. 


1. Measures of overt stress will differentiate 
the highly successful from the least successful 
therapy cases. 

2. Measures of productivity will differenti- 
ate the highly successful from the least success- 
ful therapy cases. 


1Now at the University of Illinois.
Previous studies. One of the surprising findings
in a perusal of the literature is the lack
of prognostic studies of psychotherapy with 
standard psychological tools. The few studies 
that do appear have been more concerned with 
the prognosis of drug and shock therapy than 
with psychotherapy. 

Bradway, Lion, and Corrigan [4] developed 
a prognostic rule on the Rorschach to predict 
the success in therapy of 16 promiscuous delin- 
quent girls. When cross validated on 20 other 
delinquent girls, the list proved to be 80% 
successful in predicting the therapeutic out- 
come. 

Harris and Christianson [12] attempted to 
match the pretherapy test results of 53 non- 
psychotic patients to their success in brief psy- 
chotherapeutic contacts. The Wechsler-Belle- 
vue scale and the traditional scoring of the 
Rorschach yielded no real differences between 
the highly improved group and the less im- 
proved group. The authors extracted Ror- 
schach factors that seemed to be of significance 
in an attempt to form a prognostic scale. Un- 
fortunately, no cross validation was done. Four 
MMPI scales, Sc, Pa, Ma, and Pd, revealed 
significant differences between the groups, with 
high scores being contraindications to a favor- 
able outcome in therapy. 

Barron [2] used the Rorschach, MMPI, 
and the Ethnocentrism scale to compare the re- 
sults of 33 adult psychoneurotics following six 
months of psychotherapy in an outpatient clin- 
ic. Therapists’ and supervisors’ ratings were 
the criteria for improvement. The Rorschach 
revealed no differences between the improved 
and unimproved groups, but the MMPI pro- 
files yielded results that agreed in part with 
the Harris and Christianson study. The un- 
improved groups scored higher on both the psy- 
chotic and neurotic scales on the pretherapy 
tests. The highest single predictor was the
Ethnocentrism scale, a high score on the E scale
being indicative of poor therapy prospects. Barron
points out that the ethnic background of
the therapists in this study might have had 
some untoward influence on this particular re- 
sult. 


Rogers and Hammond [15] obtained a 
sample of 109 patients drawn randomly from 
a VA Mental Hygiene Clinic population. Suc- 
cess in therapy was rated by the therapist and 
senior psychiatrist on the staff. Neither clinical 
judgment on the Rorschach nor the Harris- 
Christianson scoring list showed any positive 
findings. Ninety-nine single variables were also 
examined with similar negative results. The 
only positive finding was the relatively high 
frequency of extensor M responses in the rec- 
ords of the successful group. 


Method 

Subjects. The 76 subjects involved in this 
study were all students of the Pennsylvania 
State College who came to the Psychological 
Clinic between September 1949 and August 
1950 to obtain aid in their personal adjustment.
A more detailed description of the subjects
has already been given in previous papers
[9, 10, 11].


Procedure. The 15 most successful clients 
and the 15 least successful clients were chosen 
on the basis of their scores on a multiple cri- 
teria scale for success in psychotherapy devel- 
oped by Tucker [17]. This scale, which pro- 
duces a single total score, consists of therapist 
ratings, judges’ ratings, client ratings, and the 
ratio of positive to negative feelings in the first
and last interviews.

James J. Gallagher

Both the standard t test and the Mann-Whitney
nonparametric technique [13] were
used to test the significance of differences
between groups. The significance of differences
between the composite MMPI profiles of the
Most Successful and Least Successful groups
was determined by methods suggested by du
Mas [5]. Since Barron [3] had developed an 
ego-strength scale which has shown consider- 
able promise in predicting response to psycho- 
therapy, the two groups were compared on 
this measure also. 

Measures. The measures of overt stress were 
as follows: 

1. The Taylor [16] anxiety scale. Since the 
short form of the MMPI was used, only 34 
items were available for this study. 

2. The Elizur [6] Rorschach anxiety scale. 

3. The number of problems checked on the 
Mooney Problem Check List. 

The measures of defensiveness or lack of 
productivity were: 

1. The number of responses on the Rorschach.


2. The number of words used in summariz- 
ing their problem on the Mooney Problem 
Check List. 

3. A defensiveness scale was developed on 
the MMPI in a previous study but proved to 
be too unreliable to be of any further use. 


The results on the above scales for the entire
therapy group have previously been presented
by the author [9].


























Table 1

Group Comparison on Measures of Stress and Productivity

                              Discontinue      Least Successful   Most Successful
                              (N = 34)         (N = 15)           (N = 15)
Measure                       Mean     SD      Mean     SD        Mean     SD
Overt stress
  Taylor anxiety               8.81    5.39    14.50    6.04      18.53    5.52
  Elizur anxiety               6.31    4.09     6.46    3.06       7.53    4.34
  Mooney problems checked     33.86   25.65    42.08   35.15      47.00   28.29
Productivity
  Rorschach responses         20.74   11.51    21.93    9.66      26.60   11.38
  Mooney words used*          46.54   12.23    46.83    6.73      53.14   10.87
Ego defense
  Barron scale                                 41.57    4.61      42.14    4.95

* Converted into T scores.


Results

Table 1 reveals the results obtained when 
comparing the Least Successful and the Most 
Successful therapy groups together with the 
results obtained previously on the Discontinue 
Therapy group, one in which the clients left 
therapy following one or two interviews. It 
may be noted that there is a constant progres- 
sion from the Discontinue group to the Most 
Successful group on both overt stress and pro- 
ductivity. Nine comparisons made on meas- 
ures of overt stress and six comparisons on 
verbal productivity show differences in the pre- 
dicted direction in every instance. The correla- 
tions between the various scales were quite con- 
sistently of a low positive order (.25 to .35) 
and could not, in themselves, account for such 
consistency. 


Table 2

t Tests of Differences between Means of Groups

                                           Groups compared
                             Discontinue vs.    Discontinue vs.    Least Successful vs.
Measure                      Least Successful   Most Successful    Most Successful
Overt stress
  Taylor anxiety             3.07**             5.59**             1.84*
  Elizur anxiety              .14                .90                .75
  Mooney problems checked     .79               1.50                .40
Productivity
  Rorschach responses         .36               1.62               1.17
  Mooney words used           .10               1.84*              1.84*
Ego defense
  Barron scale                                                      .32

* Significant at 5% confidence level.
** Significant at 1% confidence level.

Table 2 shows the t tests of differences between
the means of the groups. The Taylor
anxiety scale clearly is the most discriminating
between the groups, and the number of words
used in summarizing problems significantly
separates the Most Successful group from the
other two groups. Several of the other measures,
e.g., the number of Rorschach responses
and the number of problems checked,
approached but did not reach the commonly
accepted levels of statistical significance.
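
As a worked illustration of one such comparison, the sketch below computes a t value from the means and standard deviations reported in Table 1 for the Taylor anxiety scale in the Least Successful and Most Successful groups. The article does not state the exact formula used; the sketch assumes that the reported standard deviations were computed with N in the denominator, so that the standard error of the difference is the square root of s1²/(n1 − 1) + s2²/(n2 − 1). The function name `t_from_summary` is hypothetical.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t for the difference between two independent means, from summary figures.

    Assumes each SD was computed with N (not N - 1) in its denominator,
    which is an assumption about the article's procedure, not a stated fact.
    """
    standard_error = math.sqrt(sd_a ** 2 / (n_a - 1) + sd_b ** 2 / (n_b - 1))
    return (mean_b - mean_a) / standard_error


# Taylor anxiety, Table 1: Least Successful vs. Most Successful (N = 15 each).
t_taylor = t_from_summary(14.50, 6.04, 15, 18.53, 5.52, 15)
# Rorschach responses, Table 1: the same two groups.
t_rorschach = t_from_summary(21.93, 9.66, 15, 26.60, 11.38, 15)

print(f"Taylor anxiety: t = {t_taylor:.2f}")
print(f"Rorschach responses: t = {t_rorschach:.2f}")
```

Under this assumption the two values come out close to the 1.84 and 1.17 shown in Table 2 for the comparison of the Least Successful and Most Successful groups.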

The one measure that did not seem too effective
was the Barron ego-defense scale, which
yielded no differences between the Least Suc- 
cessful and Most Successful groups. 

Table 3 shows the mean MMPI scores for 
the Least Successful and the Most Successful 
groups. Analyses of these results by du Mas’ 
methods of profile analysis revealed the two pro- 
files to be quite similar in shape, elevation, and 
scatter. The one difference that is statistically 
significant appears on the D scale on which 
the Most Successful group scored significantly 
higher before therapy began. The author has 
previously shown that this is one of the scales 
most likely to be influenced by contact with 
therapy. 


Table 3

Comparison of Less Improved with
More Improved on Pretherapy MMPI Scores

          Less improved        More improved
          (N = 15)             (N = 15)
Scale     Mean      SD         Mean      SD
F         60.86      8.66      59.67      7.99
K         54.27      7.56      51.73      6.63
Hs        56.33     12.54      56.27     13.20
D         65.73*    10.51      75.07     16.59
Hy        62.20     12.13      64.07      9.46
Pd        67.00     13.15      66.40     14.41
Pa        60.67      8.27      60.40      8.96
Pt        68.13     13.06      71.20     10.18
Sc        67.27     13.78      64.93     13.34
Ma        61.33     13.53      56.60     11.40

* Difference significant at 5% level of confidence.

The present MMPI results differ consider- 
ably from both the Harris and Christianson 
study and Barron’s work. The major pre- 
therapy differences on the profiles are the low- 
er mean scores on the Pa and Sc scales in the 
present study. This result may be due to the 
weeding out of clients with psychosis or psy- 
chotic trends before starting therapy or may be 
due to the absence of those characteristics in 
the present college population. 


Discussion 


The results seem to justify a tentative con- 
clusion that some of the factors that are related 
to clients leaving client-centered therapy are 
also operating to a lesser degree to prevent any 
substantial success in therapy itself. Two of 
these factors seem to be lack of overt tension 
and a lack of verbal productivity as revealed 
on the pretherapy tests. An interesting extension
of this study would be to examine the
relationship between lack of productivity in the
tests and lack of productivity in the interview 
situation itself. 


The implications here are that unless the 
client is motivated by some overt anxiety to 
change his perceptions and unless he is able to 
give of his perceptions freely with the counsel- 
or, then client-centered methods of counseling 
will produce a minimum of change whether or 
not the client leaves or remains in the situa- 
tion. This, of course, does not necessarily mean 
that any other therapeutic methods would be 
more successful until evidence is presented to 
prove that contention. 


It occurred to the writer after the data 
were collected that the statistical significance 
of the differences between the present groups 
on the stated variables depended on some assumptions
that probably were not completely
met. It is generally assumed that when groups
are compared on one variable, the other
relevant variables are either held constant or
randomized. In the present case, that would
mean that there would be as much likelihood for
circumstances to cause clients to move from 
the discontinue category or the least successful 
category into the most successful category as 
there would be for them to go in the opposite 
direction. 


Although no concrete evidence was available, 
it seems most unlikely that the above assump- 
tion was satisfied. A number of legitimate 
reasons can be given as to why clients who 
should have been in the Most Successful group 
according to our test predictors ended in the 
Discontinue group (moved out of town or 
failed out of school) or in the Least Success- 
ful group (poor counseling). It is hard to con- 
ceive of circumstances of equal probability that 
could turn discontinue cases or minimally suc- 
cessful cases into highly successful cases. 


Thus, each of these adverse circumstances, 
present to an unknown degree, would have the 
effect of lowering the probability of getting 
statistically significant differences on the orig- 
inal variable. Therefore, any differences that 
are obtained should be viewed with a favor- 
able eye, and the experimenter should hesitate 
before throwing out data that approach but do 
not reach statistical significance under these
circumstances.

It is always well to try to estimate to what 
extent the present results can be generalized 
to other situations. A possible clue is the dis- 
agreement between the present results and the 
previous findings of Barron and of Harris and 
Christianson. As has already been noted, the 
present population, as revealed by MMPI re- 
sults, is a rather homogeneous group. 

One might conclude that the present results 
will hold only if other factors are kept relative- 
ly constant. When the total population becomes 
heterogeneous, such as might be encountered in 
an out-patient mental hygiene clinic, then other 
factors such as the amount of psychotic under- 
lay, psychopathic trends, etc. may be more im- 
portant in determining prognosis than the 
amount of overt stress and lack of productivity 
mentioned here. 

The next question that needs to be answered 
is what can be done with the group, now par- 
tially identifiable, that does not seem able to 
profit very much from traditional methods of 
counseling. Here, perhaps, lies the most fruit- 
ful area of experimentation and research in the 
near future. 


Summary 


Results on pretherapy Rorschach, MMPI, 
and Mooney Problem Check List tests were 
used to compare 34 clients who discontinued, 
15 clients who showed the least gain, and 15 
clients who showed the most gain from client- 
centered counseling. 

The findings seem to support the hypothesis
that success in client-centered counseling is
positively related to both the amount of overt
stress and the amount of verbal productivity
shown on the pretherapy tests.


Received March 17, 1954.
Implications of the Performance of Delinquent
Girls on the Rosenzweig Picture-Frustration Study¹


Julia R. Vane 


Hempstead Public Schools, New York 


A group of 95 delinquent girls and a group 
of 50 nondelinquent girls of similar age, in- 
telligence, and socioeconomic background were 
given the Rosenzweig Picture-Frustration 
Study according to standard instructions. 

The delinquents were from a state reformatory;
mean CA 18.9 ± 1.6, mean IQ 91.5 ± 14.1.
The nondelinquents were from the sixth
term of a vocational high school; mean CA 16.9
± .8, mean IQ 94.2 ± 6.2.

The results showed that the delinquents 
differed significantly from the nondelinquents 
and from Rosenzweig’s norms in E, I, and M,
and from the norms in OD and NP. The de- 
linquent responses indicated less than average 
tendency to turn aggression outward, and a 
greater than average tendency to turn aggres- 
sion upon themselves or to evade it. The re- 
sponses also showed less concentration on frus- 
trating objects and greater than average inter- 
est in solving the problem. 

That the delinquents differed from the norms 
and nondelinquents was not surprising, but it 
was not expected that they would respond in a 
manner which would indicate that they were 
less aggressive than either the norms or nonde- 
linquents. 


Two other studies [1, 3] indicate that when
subjects wish to make a good impression on the
P-F Study, the responses differ from the norms
in the same manner as the responses of the
delinquents. It would appear therefore that the
delinquents were trying to make a favorable
impression on this scale.

¹ An extended report of this study may be obtained
without charge from Julia Vane, 80 Riverdale Rd.,
Valley Stream, N.Y., or for a fee from
the American Documentation Institute. To obtain it
from the latter source, order Document No. 4345
from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress,
Washington 25, D.C., remitting in advance $1.25 for
microfilm or $1.25 for photocopies. Make checks payable
to Chief, Photoduplication Service, Library of Congress.

That this ability to make a good impression 
on the P-F Study is not dependent upon in- 
telligence was revealed when the delinquents 
were divided into two subgroups and the results
compared. One subgroup (N = 45) had
a mean IQ of 79; the other (N = 50) had a
mean IQ of 102. There were no significant 
differences between the groups in any of the 
categories. 

The possibility that the response pattern 
shown by the delinquents may be a general 
delinquent pattern is suggested by the fact that 
the P-F responses of 110 women prisoners [2] 
deviated significantly from the norms in the 
same manner and almost to the same numerical 
degree as the delinquent responses. 

On the basis of this and other studies men- 
tioned [1, 2, 3] it seems inadvisable to use the 
P-F Study as a means of analyzing patterns of 
reactions of delinquents. 

Brief Report 
Received June 8, 1954.
Tactual-Kinesthetic Perception as a Technique for
Diagnosing Brain Damage¹


James W. Parker²


Brooke Army Hospital, San Antonio, Texas 


It has been demonstrated that many of the 
current tests for brain damage are inadequate 
[1, 5, 11], although a few, particularly those 
stressing visual-motor performances, have 
proved effective. However, there are patients 
whose visual acuity is reported to be satisfac- 
tory, but who experience marked perceptual 
defects in other spheres as a result of brain in- 
jury [6]. Therefore, in view of DeJong’s state- 
ment [4, p. 47] that the most difficult tasks for 
brain-injured patients are those requiring the 
subject to break away from old mental habits 
and adapt to unfamiliar situations, it would 
seem that techniques placing minimal emphasis 
upon visual perception should be of importance 
in evaluating these patients. In addition, the 
assertion by Strauss and Lehtinen [13] that 
different individuals stress different sense mod- 
alities in their everyday living would suggest 
that a patient, while actually suffering im- 
pairment, could manifest a “normal” perform- 
ance on tasks requiring primarily the use of 
those modalities that he had emphasized 
throughout his life. 

Simple tests of tactual-kinesthetic perceptual 
abilities have long been included among the di- 
agnostic techniques of the neurologist; yet, in 
spite of a rather general recognition that brain- 
injured patients manifest impaired functioning 
in this sphere [4, 9, 14], carefully controlled 
research in the area has been meager. The few
studies have been in most instances of limited
application, either because of experimental design
or the specific nature of the investigations.
In some of these studies the patients have been
inadequately diagnosed; in others, controls
have been lacking or not comparable with the
experimental subjects. In the main, these studies
have been applied to special groups, such as
young children or mentally deficient adults.
Strauss and Lehtinen expressly state that
“... although a comparative psychopathology
of brain injury in children and adults shows
many identities, some features . . . are manifestations
of childhood” [13, p. 117] and that
final conclusions regarding perceptual disturbances
cannot be drawn because comparative
studies with their tests, some of which accent
tactual-kinesthetic performances, have not been
made.

¹ Part of a dissertation presented to the Faculty of
the Graduate School of the University of Texas in
candidacy for the Degree of Doctor of Philosophy.
The author wishes to express his thanks to Professors
Philip Worchel, Wayne H. Holtzman, Hugh
C. Blodgett, and Lloyd A. Jeffress of the Department
of Psychology, University of Texas, and to Colonel
Charles S. Gersoni, U. S. Army, for their helpful
suggestions throughout the design and execution of
this investigation.

² At Walter Reed Army Medical Center when
this research was done.


It was the purpose of the present study (a) 
to determine whether there is a significant dif- 
ference in tactual-kinesthetic performances of 
brain-injured and non-brain-injured patients, 
(b) to investigate the possible relationship between
tactual-kinesthetic task performance on
the one hand, and the site and extent of the 
brain lesions on the other, and (c) to compare 
the patients’ tactual-kinesthetic performances 
with their Bender-Gestalt test performances, 
emphasizing the relative discriminative power 
of the two types of measures. 


Subjects 


The experimental group consisted of thirty 
adult male hospitalized patients with definitely 
diagnosed brain injuries of fairly recent origin. 
Most of them had been injured within six to 
ten months of the time of this study. They 
ranged in chronological age from 21 to 48 years, 
with a mean age of 25 years. Their mental 
ages, obtained from the Shipley-Hartford Retreat
Scale, ranged from 9 years, 6 months to
18 years, 7 months, with an average mental age
of 14 years, 1 month. Patients with gross 
sensory and motor defects on the dominant 
side, neuropsychiatric patients, the noticeably 
mentally deteriorated, and patients with severe 
visual defects were not included in the investi- 
gation. 

The control group of 30 adult male patients 
who revealed no evidence of brain damage was 
drawn at random from general medical hos- 
pital wards. They ranged in chronological age 
from 19 to 40 years, with an average age of 
23 years. Their average mental age was 15 
years, 2 months and ranged from 10 years to 
18 years, 7 months.³


Procedure 


Each S was assured that the purpose of the 
examination was to facilitate understanding of 
his condition, and he was then administered 
the Bender Visual-Motor Gestalt Test, an ex- 
amination frequently administered first in a 
battery because its innocuous appearance is be- 
lieved to alleviate anxiety. Without knowledge 
of which were “organic” and which were “non- 
organic” records, the reproductions were scored 
according to the system devised by Pascal and 
Suttell [12]. To see if this test would differ- 
entiate brain-damaged from uninjured indi- 
viduals, the results of the two groups were 
compared. In addition, Bender-Gestalt test per- 
formances were compared with those on tactu- 
al-kinesthetic tasks to determine whether the 
latter techniques would reveal the presence of 
brain damage in Ss not detected by the Bender- 
Gestalt test. 

Following the Bender-Gestalt test, and af- 
ter testing each individual for stereognostic 
ability by having him identify several tactually 
perceived objects, Ss were administered the 
tactual-kinesthetic tasks. The materials consisted
of fourteen 8 × 11 inch wooden boards
with various patterns on them made up of partially
raised thumbtacks, half with smooth backgrounds,
and half with backgrounds of thumbtacks
embedded flush with the boards. In a
pilot study four of the patterns had been
matched with four other patterns; a “6”, for
example, was found to be of equal difficulty
with a “9”. In the present experiment one of
each pair of these equated patterns had a background
of thumbtacks, to determine whether
the brain-injured group would experience
greater difficulty than the controls in differentiating
the figures from their backgrounds.

³ The difference of one year, one month, in average
mental ages between the two groups was not significant
at the .05 level of confidence.


With the boards hidden from the S’s view, he was 
asked to explore their surfaces with his dominant 
hand until he was prepared to draw what he felt. 
It was stressed that the surface should be explored 
only as long as was necessary to be able to repro- 
duce it accurately, and that while accuracy was of 
first importance, time was being recorded and would 
be considered. Furthermore, S was requested to
describe the various surfaces while he was feeling
them, and to make any comments that should enter
his mind. Verbalizations were recorded, and nota- 
tions were made concerning the manner of approach 
employed by S. 

When the first procedure was completed for all 
the stimuli, the entire series was again presented to 
S, in the same manner except that he was not asked 
to reproduce the surfaces, but to feel them until he 
was ready to choose from a series of drawings the 
one most nearly like the surface he had explored. 
For each stimulus there were seven drawings from 
which to choose. These drawings were prepared in 
such a way that the brain-injured patient’s alleged 
tendency to react concretely [8] would be elicited. 
The correct choice was always a straight-line draw- 
ing without the background represented; other 
choices consisted of drawings wherein an incorrect 
figure was made with dark spots representing tacks; 
and finally there were choices with incorrect straight- 
line figures on backgrounds representing tacks. 


Methods similar to those of Strauss and 
Lehtinen with children [13] were used to com- 
pare the drawings made by the experimental 
group in the present study with those of the 
control group on the basis of the following 
criteria: (a) the number of straight-line
drawings versus the number of drawings
representing thumbtacks, (b) the number of drawings
with only circles of the background, (c) the
number of drawings with the figures as
well as the background, and (d) the number
of figure rotations. In addition, two
investigators independently rated each
drawing on a scale from zero to five. A rating
of five was given to drawings most
accurately representing the stimulus figure and
a rating of zero to the most inaccurate
representations. In order that the judges would not
know whether they were rating the record of a
brain-injured or of a control S, all identifying
data were removed and the records were
shuffled during the rating procedure. The
Spearman-Brown reliability coefficient for the
two raters was .98. The ratings assigned to
individual drawings by each judge were averaged
separately, and the mean of the two averages
was an S’s final total drawing score.

It was fully realized that the brain-injured 
Ss of this study were not ideally suited for the 
investigation of the effects of cortical localiza- 
tion. The precise location and extent of such 
lesions are almost impossible to determine. 
Nevertheless, in view of the scarcity of previ- 
ous studies in this area, it was decided to use 
the data to investigate effects of localization on 
tactual-kinesthetic performances. In all but four 
cases, wherein craniotomies were not per- 
formed, the brain surgeon or neurologist indi- 
cated on a chart, as well as he could ascertain, 
the brain areas involved in each lesion. The 
Ss were divided into groups according to the 
site of their lesions, and analyses were made to 
determine whether certain brain areas were
particularly related to tactual-kinesthetic
performance. Finally, the data were examined for
significant relationships between the extent of
the lesions and the degree of disturbance noted.


Results and Discussion 

Bender Visual-Motor Gestalt Test. The 
mean score of the brain-injured group on the 
Bender-Gestalt was 42.4, while that of the 
controls was 26.3. The difference of 16.1 be- 
tween these means is statistically significant be- 
yond the .01 level of confidence, thereby sub- 
stantiating the assertion by Pascal and Suttell 
[12] that damage to the cortex results in high- 
er scores on the Bender-Gestalt test. 

When the median score of 28 was used as 
a cutting score, 72.4 per cent of the brain-in- 
jured group fell above this point, and 36.6 per 
cent of the uninjured scored higher. With a 
cutting score of 39, moreover, 48 per cent of 
the brain-injured Ss were included in the high- 
er group, and 10 per cent of the uninjured. 

Not all of the very high-scoring patients 
were clinically apparent “organics.” Some of 
them would undoubtedly have constituted di- 
agnostic problems were it not for their history 
of traumatic brain injury. This observation 
does not corroborate the comment by Pascal
and Suttell that only in extreme cases, which
are also clinically apparent, can the
Bender-Gestalt test answer the question: “Does the
psychological record show signs indicative of 
damage to the cortex?” [12, p. 40]. 

Object-recognition test. Because the two 
groups could not be significantly differentiated 
on the basis of their ability to recognize nine 
different objects by the sense of touch, it would 
appear that tactual acuity in the brain-injured 
group was not seriously impaired, and that
astereognosis was absent.

Analysis of the drawings. On the basis of
the scores assigned to the various drawings of
the stimuli, the control and experimental
groups were differentiated beyond the .001 level
of confidence. With the median of 3… as a
cutting score, 76.6 per cent of the experimental
group were included in the low-scoring half,
and 26.6 per cent of the controls; 75 per cent
of all Ss would have been correctly diagnosed
on this basis. A cutting score of 27, moreover,
included 46.6 per cent of the brain-injured Ss
and 10 per cent of the uninjured.
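
The 75 per cent figure follows from the two group percentages when the groups are of equal size. The following is a minimal sketch of that arithmetic, assuming 30 Ss per group as stated in the Summary; the function name hit_rate and its parameter names are illustrative and are not part of the original study.

```python
def hit_rate(injured_flagged, controls_flagged, n_injured=30, n_control=30):
    """Overall proportion of Ss correctly classified by a cutting score.

    injured_flagged  -- proportion of brain-injured Ss on the low-scoring side
    controls_flagged -- proportion of control Ss on the low-scoring side
    Group sizes of 30 are an assumption taken from the Summary.
    """
    correct_injured = injured_flagged * n_injured
    correct_controls = (1 - controls_flagged) * n_control
    return (correct_injured + correct_controls) / (n_injured + n_control)


# Percentages reported in the text for the two cutting scores.
median_cut = hit_rate(0.766, 0.266)
stricter_cut = hit_rate(0.466, 0.10)

print(f"median as cutting score: {median_cut:.1%}")
print(f"cutting score of 27:     {stricter_cut:.1%}")
```

With the median as the cutting score the sketch returns the 75 per cent reported above. The stricter cutting score flags fewer controls, but under the same assumption it classifies a smaller share of all Ss correctly.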

Although the Bender-Gestalt test approached
these results, with the median as a cutting
score it detected only one brain-injured S not
scoring below the median on the
tactual-kinesthetic drawings. Low tactual-kinesthetic
drawing scores, however, included three
brain-injured Ss not suggested by the Bender-Gestalt
scores. The coefficient of correlation between
the Bender-Gestalt and tactual-kinesthetic
tests was .68.

Unlike the reactions of the brain-injured
children studied by Strauss and Lehtinen [13],
the adult brain-injured Ss of the present study
could not be differentiated from the control
group on the basis of (a) the number of Ss
who drew straight-line forms; (b) the number
who drew only the circles of the background;
or (c) the number who drew the figure as well
as the background. The reason these results are
so different from those found with children
may be accounted for by the assumption that
perception in children differs from that in
adults, and that, as Hebb postulates, “... an
early injury may prevent the development of
... capacities that an equally extensive injury,
at maturity, would not have destroyed” [10, p.
292].

When the drawings were analyzed for 90
and 180 degree rotations from the true
orientation of the stimulus figures, it was observed
that rotation occurred in the case of eight
brain-injured Ss and only one control S, the
difference being significant at the .02 level of 
confidence. Were this trend to be borne out by 
further investigation, rotations on this task, 
as on the Bender-Gestalt test [3], may prove 
to be an important pathognomonic sign. 


Reaction times. Although the brain-injured 
group scored lower than the control group on 
their reproductions of the tactual-kinesthetic 
stimuli, these injured Ss had far more objective 
experience with the task. Whereas this group 
averaged 2.07 minutes per surface, the
uninjured Ss explored each surface for an average
of only 1.35 minutes, a difference significant
at the .02 level of confidence. 


Allen and Krato [2] have speculated that 
the knowledge that he is being timed may ex- 
acerbate the brain-damaged patient’s anxieties 
relating to his physical disability and serve to 
disrupt any train of organization, and that 
such disruption of an already impaired organi- 
zational capacity may well cause slower re- 
action times in the brain-injured. In this con- 
nection Goldstein maintains that brain damage 
“causes a rise of the threshold, and retardation 
of the excitation” [7, p. 599]. There occurs, 
he states, a reduction in receptivity, causing 
the patient to take much more time to react. 
Furthermore, the apparent need for several of 
the brain-injured Ss to count each individual 
thumbtack even when advised that this was un- 
necessary may have been what Strauss and 
Lehtinen have described as “. . . an unconscious 
adaptation to order the external world in such 
a way that a positive result, and with it the 
satisfaction of achievement, can be obtained” 
[13, p. 27]. 


Multiple-choice recognition responses. For 
both groups of Ss, choosing from a series of
drawings the one most like the surface just ex- 
plored proved to be a difficult task. However, 
the brain-injured group made markedly lower 
scores. Their mean number of correct choices 
was 7.3, while that of the controls was 10.4. 
The difference of 3.1 is significant beyond the 
.01 level of confidence. The nature of the
drawings undoubtedly created particular
difficulties for each group. The Ss had to decide
what was most essential in each choice: (a)
the accurate representation of a figure made
with a straight-line drawing, (b) an incorrect
figure consisting of dots representing tacks, or 
(c) an incorrect figure having a background 
representing tacks. It was thought that the 
latter two types of responses were more concrete 
in nature and might, thereby, be made more 
frequently by brain-injured patients. Contrary 
to this expectation, however, only 68 per cent 
of the brain-injured patients’ incorrect choices 
were of the “concrete” type, whereas 73 per 
cent of the incorrect choices of the uninjured 
patients were. This analysis, therefore, failed 
to reveal any particular concrete mode of ap- 
proach that these brain-damaged patients may 
have been experiencing. 

Comparison of thumbtack-background ver- 
sus smooth-background stimuli. Four of the 
stimulus figures had been equated for difficulty 
with four others, and one set of these was giv- 
en a background of randomly embedded flat 
thumbtacks while the other set was left with 
smooth backgrounds. The performances of Ss 
on these two sets of stimuli were then analyzed 
in terms of (a) time spent exploring the
surfaces, (b) scores made on the drawings of the
stimuli, and (c) ability to make correct choices 
in the multiple-choice experiment. Not one of 
these variables showed a significant difference 
between the two groups. The stimuli with 
thumbtack backgrounds offered special difficul- 
ties to each group, and both groups performed 
faster and more accurately on the stimuli with 
smooth backgrounds. The brain-injured Ss 
manifested practically the same enhancements 
on the smooth backgrounds, and the same de- 
gree of impairment on the thumbtack back- 
grounds as did the control Ss. 

Qualitative analysis of the tactual-kines- 
thetic performances. Although the present in- 
vestigation offers no evidence that the brain- 
injured Ss experienced greater difficulty than 
the uninjured in differentiating the figures 
from their backgrounds, and while no quanti- 
tative evidence suggesting an otherwise singu- 
larly concrete approach has been elicited, closer 
scrutiny of the various performances suggests 
that they may in fact have been more or less 
confined to a concrete type of behavior. In
addition to the aforementioned tendency for some
of the experimental Ss to count all the tacks,
apparently as if they could not perform at all
without doing so, several of this group literally
drew individual thumbtacks in profile in their
attempted reproductions, seemingly being
unable to arrange the separate aspects of a figure
into an integrated whole. The immediate
apprehension of this situation for these Ss
appeared to be simply thumbtacks in a board, and
this is what they drew. Goldstein emphasizes
that brain-damaged patients cannot transcend
this “immediate apprehension of the given
thing or situation in its particular uniqueness”
[8, p. 3], and that their concrete approach
causes them to be “directed by the immediate
claims which one particular aspect of the
object makes” [8, p. 3]. Furthermore, he
maintains that because this experienced
uniqueness exerts demands upon the patient, it is
practically impossible for him to realize other 
functions of the object, or to think of it as an 
example or representative of a larger whole. 

Another feature of the brain-injured Ss’ per- 
formances was their marked tendency to re- 
main silent while exploring the various sur- 
faces, in spite of the repeated instruction to tell 
the examiner whatever came into their minds. 
It seemed that most of these Ss were unable 
to perceive the stimuli and, at the same time, 
tell what they were doing or thinking. This 
effect, too, has been observed by Goldstein [8], 
who subsumes it under disturbances in abstrac- 
ting. He asserts that the more difficult per- 
formances for the brain-injured are those which 
demand that they take account of what they 
are doing, and that such tasks will suffer more 
than those not requiring reflection. 


Three dysfunctions listed by Goldstein 
would seem to account in part for many of the 
disturbances experienced by the brain-injured 
Ss on the tactual-kinesthetic tasks of the present 
investigation: (a) an inability to “account for
acts to oneself; to verbalize the account,” (b)
an inability to “hold in mind simultaneously
various aspects,” and (c) an inability to “grasp
the essential of a given whole; to break up a 
given whole into parts, to isolate and to synthe- 
size them” [8, p. 4]. 

Localization and extent of lesions. The
drawing scores of the patients whose lesions
could be macroscopically localized were
categorized according to the brain areas involved.
Because there are so few cases within each
category, statistical analyses of the data were
limited. The mean score for those whose lesions
involved the parietal area was compared with
the mean for those whose lesions involved
frontal areas, and the difference between
these means was not significant. In order
to determine if there was a relationship
between drawing scores and whether the lesion
was in the dominant or nondominant cerebral
hemisphere, the average score for patients whose
lesions were in the dominant hemisphere was
compared with that for patients who had
lesions in the nondominant hemisphere, and the
difference also was not significant.

Finally, to see if tactual-kinesthetic drawing
scores were related to extent of brain damage,
the records were divided according to whether
the injury was large or small. Lesions which
appeared to involve more than approximately
one-fourth of one hemisphere were considered
to be large, and those involving less were
considered small. The average score for patients
with large lesions was 22.3, and for those with
small lesions, the average was 31.7. Although
there was a difference of 9.4 in favor of the
latter group, this was not significant even at
the .20 level of confidence and, therefore, could
easily have occurred by chance.


Summary 


The purpose of this study was (a) to investi- 
gate the performances of brain-injured patients 
on tactual-kinesthetic tasks, primarily to dis- 
cover whether such techniques could be used to 
differentiate brain-injured from
non-brain-injured patients, (b) to determine if there is a
relationship between tactual-kinesthetic task 
performance on the one hand and the site and 
extent of brain lesions on the other, and (c) 
to compare the patients’ tactual-kinesthetic per- 
formances with their Bender-Gestalt test per- 
formances, with particular emphasis on the 
relative discriminative power of the two types 
of measures. 

Sixty hospitalized adult male patients were 
examined, half having brain injuries of fairly 
recent origin and half revealing no evidence of 
neurological involvement. Following
administration of the Bender-Gestalt, and after
examining each individual for stereognostic
ability, the Ss were administered the
tactual-kinesthetic tasks. The materials for these tasks
consisted of 14 wooden boards with various 
patterns on them made up of partially raised 
thumbtacks, some with smooth backgrounds, 
and others with backgrounds of thumbtacks 
randomly embedded flush with the boards. 
With the boards hidden from the S’s view, he 
was asked to explore their surfaces with his 
dominant hand, to tell whatever came to mind, 
and then to draw a picture of what he had 
felt. 

Although both the Bender-Gestalt and the 
tactual-kinesthetic techniques significantly dif- 
ferentiated the two groups, the latter technique 
appeared to offer greater discriminative power, 
differentiating the brain-injured from unin- 
jured subjects at highly significant levels on 
the basis of the following criteria: (a) scores 
assigned to drawings of the stimuli, (b) the
time consumed in exploring the various sur- 
faces, (c) the responses to multiple-choice rec- 
ognition items, and (d) the tendency to rotate 
reproductions of the stimulus figures. In view 
of these findings, it was concluded that tactual- 
kinesthetic perception, as has been shown with 
perception in other spheres, is often decidedly 
impaired in brain-injured individuals, and that 
performances relying on such perception may 
be used to differentiate brain-injured from un- 
injured patients. 

No marked relationships were noted between 
tactual-kinesthetic task performances and the 
location or extent of the brain lesions. 


Received April 13, 1954.

Reviews of the literature on the sentence
completion test [6, 9] indicate that the SCT
is frequently used in the diagnosis of adult
personality problems. Rotter and Willerman [7]
used the test as a method of studying person- 
ality in the military setting; Machover [3] 
discusses its function as a part of the total di- 
agnostic program in a psychiatric hospital; 
Stein [10] and Sacks [8] suggest methods of 
interpreting material produced on the SCT. 
Also, Sacks and Levy [9] have emphasized the 
use of the SCT as an instrument in forming 
clinical hypotheses about the emotional atti- 
tudes and mechanisms of the patient. 


In these and other studies [e.g., 1, 2, 4, 5], 
sentence stems incorporated into tests have 
been selected on an a priori basis or from clini- 
cal experience, depending upon the purpose of 
the test. Though tests so designed may furnish 
hypotheses to the clinician about patients, the 
degree to which the various stems are used in 
making those hypotheses remains unknown. 
The impression of the authors is that some 
stems consistently produce many clinical hy- 
potheses while others consistently do not. This 
research has been designed to find those stems 
which are consistently most productive. It is 
assumed that, with a given test length, the 
more useful the material is in making and sup- 
porting hypotheses, the better position the clini- 
cian is in when evaluating the patient. 


Types of SCT instructions have been
discussed by Rotter [6], Sacks and Levy [9], and
Meltzoff [4]. Instructions emphasizing (a)
“real feelings” and (b) speed are commonly
recognized. These two kinds are studied here
in relation to their effect on productivity of
hypotheses for the test as a whole.

1Acknowledgments are offered to the staff and
trainees of Chillicothe VA Hospital and to the VA
psychologists throughout the country who assisted in
this research.

This paper is published with the permission of
the Chief Medical Director, Department of
Medicine and Surgery, Veterans Administration, who
assumes no responsibility for the opinions expressed
or conclusions drawn by the authors.


Procedure 


Construction of test. Sixty-five stems were 
chosen for the test. This seemed to be the maxi- 
mum number a hospital patient could complete 
in one session without undue effort and fatigue. 
Forty stems were taken from the Rotter In- 
complete Sentences Blank, Adult Form, be- 
cause a great deal of research has already been 
done on them in other situations. The
remaining 25 stems were created to secure further
information about ward adjustment of patients,²
patient identification, and the utility of stems 
using the third person pronoun. 


Two different sets of instructions were used. 
They were alternated from patient to patient. 
One set emphasized speed and the other em- 
phasized “real feelings” as in the Rotter ISB, 
as follows: 


Set S. This is a test of your speed of thinking. 
Complete the following sentences as quickly as you 
can. Write down on each line the first idea that
occurs to you. It makes no difference whether the
statement you write is true or not. Do not start
until the examiner tells you to begin.


Set F. Complete these sentences to express your 
real feelings. Try to do every one. Be sure to make 
a complete sentence. 


With the subjects who received the speed in- 
structions a stop watch was used as a prop. 
The order of the sentence stems, as well as 
the instructions, was varied from subject to 
subject. The stems were reproduced on five
pages, 13 stems to a page. The order of the
pages was systematically alternated
throughout both subject samples. By this means the
subjects, and the clinical judges also, did not
consistently respond to the same stem order on
every test. For the second sample of 30
subjects the stems were reshuffled and a new
order was arranged on each page. This complete
rearrangement of stem order was made to
prevent context variables from affecting the
results.

2Another study is being planned to investigate
the use of the SCT in predicting ward adjustment.

Subjects. The SCT’s, thus constructed, were
administered to 60 patients soon after their
admission to Chillicothe VA Hospital. The
subjects were divided into two samples of 30 
each (Sample A and Sample B). No regard 
was given to the reason for admission or the 
diagnosis. If a patient was too disturbed or un- 
cooperative to complete as many as 55 of the 
65 stems, he was not used in the research. 


Judges. After the protocols had been com- 
pleted by the patients, they were sent in pairs 
together with a standard set of instructions to 
clinical psychologists who acted as judges. 
Thirty-five judges were selected by geographi- 
cal stratified sampling of VA neuropsychiatric 
hospitals throughout the country. Four judges 
were selected locally. This method of selection 
afforded a wide sample of theoretical
approaches currently used among VA psychologists.³
Each of the 39 judges evaluated either one or 
two of the 60 protocols. No judge judged pro- 
tocols from both Sample A and Sample B. 


The judges were asked to make hypotheses 
about the subject from the completed protocol 
in the same way they would if they were writ- 
ing a case report of the individual. They were 
then asked to list these formulated hypotheses 
and to put after each one the code numbers of 
the completed sentences used in deriving the 
hypothesis. Inferences about ward adjustment 
were also made and listed separately. 


Results 


From the lists of personality inferences
returned by the judges the number of times each
judge used each stem was tabulated. No
attention was given to the nature of the hypotheses
or the number of stems used for a single
hypothesis. The number of times any judge
referred to any one stem ranged from zero to
six times.

3Among these 39 judges, 7 called their approach
psychoanalytic; 13, modified, broad, eclectic, or
neo-psychoanalytic; 8, eclectic; 4, learning or social
learning; 3, Sullivanian; 1, Jungian; 1,
client-centered; 1, neo-Adlerian; 1, perceptual-cultural.

Rue L. Cromwell and Richard M. Lundy

The number of times an individual judge 
referred to completed stems in a single
protocol ranged from 15 to 112. If a simple
frequency count were used in evaluating the
productivity of individual stems, the judges
making frequent references to stems would have a
greater influence on the results than those
judges who made infrequent references to
them. To control this factor, the number of
times each stem was used by each judge was
divided by the total number of times he made
references to all stems. The resulting weighted
scores allowed each judge of each protocol to 
contribute equally to the results. 
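
The weighting described above can be sketched as follows. This is an illustration only: the counts are hypothetical, and the function name weighted_stem_scores and the data layout (one dictionary of stem references per judged protocol) are assumptions, not the authors' tabulation sheets.

```python
from collections import defaultdict


def weighted_stem_scores(judgments):
    """Sum, over judged protocols, each stem's share of that judge's references.

    judgments -- list of dicts, one per judged protocol, mapping a stem
                 number to the number of times the judge cited that stem.
    """
    totals = defaultdict(float)
    for counts in judgments:
        n_refs = sum(counts.values())
        if n_refs == 0:
            continue  # a judge who cited no stems contributes nothing
        for stem, count in counts.items():
            # Dividing by the judge's total gives every judgment a weight of 1.
            totals[stem] += count / n_refs
    return dict(totals)


# Hypothetical counts: one judge citing stems sparingly, one citing freely.
judgments = [
    {1: 3, 2: 1, 7: 1},         # 5 references in all
    {1: 6, 2: 2, 7: 6, 50: 6},  # 20 references in all
]
scores = weighted_stem_scores(judgments)
for stem in sorted(scores, key=scores.get, reverse=True):
    print(stem, round(scores[stem], 3))
```

Because each judgment's shares sum to one, the judge with 20 references influences the totals no more than the judge with 5.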

The weighted scores were summated for 
each of the 65 sentence stems for both Sample 
A and Sample B, giving each stem two
productivity scores. These two sets of scores were
correlated to see if the stem productivity was
consistent from one sample to the other. The
product-moment correlation was +.745.
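
For readers who wish to repeat this step on their own data, a product-moment correlation between two sets of stem scores may be computed as in the sketch below. The five pairs of scores are hypothetical and will not reproduce the reported coefficient; the function name pearson_r is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical productivity scores for the same five stems in two samples.
sample_a = [0.9, 0.3, 0.5, 0.3, 0.1]
sample_b = [0.8, 0.4, 0.6, 0.2, 0.2]
r = pearson_r(sample_a, sample_b)
print(f"r = {r:+.3f}")
```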

The next step was to find a cutting point in 
the data in order to select the most productive 
stems for a final test form suitable for admin- 
istration. By inspection, a cutting point was 
determined between the 45th and 46th stem, 
arranged in order of productivity. Of the 45 
best stems in Sample A, 41 were also among 
the best 45 of Sample B. A rank-difference cor- 
relation of +.401 was obtained between the 
ranking of the best 45 in Sample A and their 
ranking in Sample B. This correlation is con- 
siderably below the correlation for all 65 stems, 
thus indicating that some stems which were 
consistently poor in productivity were elimi- 
nated. This final selection of 45 stems repre- 
sents a compromise between the goals of high 
productivity and optimal test length for be- 
havior sampling. 
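
The two consistency checks in this paragraph, the overlap between the best stems of each sample and the rank-difference correlation, can be sketched as below. The paper does not state its computing procedure; the sketch uses the usual rank-difference formula, which assumes no tied ranks, and the stem numbers and ranks shown are hypothetical.

```python
def top_overlap(best_a, best_b):
    """Number of stems that appear among the best stems of both samples."""
    return len(set(best_a) & set(best_b))


def rank_difference_rho(ranks_a, ranks_b):
    """Rank-difference correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)); no ties assumed."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical example: five stems ranked for productivity in each sample.
ranks_in_a = [1, 2, 3, 4, 5]
ranks_in_b = [2, 1, 3, 5, 4]
print(top_overlap([1, 2, 3, 4], [2, 3, 4, 9]))
print(rank_difference_rho(ranks_in_a, ranks_in_b))
```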

In order to obtain the most reliable ranking, 
the productivity scores from Samples A and B 
were added together for each stem. With the 
Spearman-Brown formula, we can estimate the 
correlation between these final productivity 
scores (combined from A and B) and the pro- 
ductivity scores from a proposed second set of 
judges and protocols similarly obtained. This
inferred correlation coefficient, computed as
+.85, is the estimated reliability of this final 


ranking of the stems: 


1. People who push me around
2. My greatest worry
3. My greatest fear
4. I like to break
5. Marriage
6. The only trouble
7. The future
8. Sometimes
9. My mind
10. I'll try to
11. I suffer
12. I like
13. A mother
14. I need
15. I am very
16. I failed
17. I can’t
18. Other people
19. Most women
20. I secretly
21. My father
22. I regret
23. I
24. People
25. She
26. He
27. I am best when
28. I hate
29. I feel
30. What annoys me
31. Fighting
32. When I was a child
33. I should
34. The happiest time
35. He wants
36. I want to know
37. I wish
38. What pains me
39. I should obey
40. The best
41. What worried us was
42. My nerves
43. Men
44. My friends
45. At bedtime
46. Back home
47. Our
48. Telling the truth
49. It
50. This place
51. In the service
52. They will
53. In school
54. Dancing
55. She needs
56. In the past we
57. They
58. AWOL
59. My superior officer
60. They had
61. Negroes
62. It seemed very
63. Reading
64. Where I worked
65. Sports
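
The reliability estimate given just before the list of stems can be checked with the Spearman-Brown formula. The sketch below assumes the usual form of that formula for a measure lengthened k times, with k = 2 because the final scores pool two samples; the function name spearman_brown is illustrative.

```python
def spearman_brown(r, k=2.0):
    """Estimated reliability when a measure with reliability r is lengthened k times."""
    return k * r / (1 + (k - 1) * r)


# Correlation between the Sample A and Sample B productivity scores.
r_between_samples = 0.745
estimated = spearman_brown(r_between_samples)
print(round(estimated, 2))
```

Entering the reported correlation of +.745 yields a value that rounds to the +.85 quoted in the text.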


The final step in the analysis was to
compare the “real feeling” instructions with the
speed instructions in terms of total test pro- 
ductivity. The difference was not significant. 
However, the design did not permit compari- 
son of instructions from ratings by the same 
judge. This finding, therefore, may have re- 
sulted from uncontrolled judge variability that 
masked real differences between instructions. 


Discussion 


In general, the results have supported the 
hypothesis that there are consistent individual 
differences in the productivity of sentence com- 
pletion stems. The choice of stems for a test 
does make a difference as to how valuable the 
test is in sampling behavior. The correlation 
of +.745 between stem productivity scores of 
two subject samples attests to the consistency 
with which some stems are more productive 
than others. 


By setting the cutting point at 45 and drop- 
ping the poorest 20 stems we derived a test 
form that seems to be of appropriate size for 
personality evaluation in neuropsychiatric hos- 
pitals. It extracts a good deal of information 
from the average patient without becoming a 
burdensome task. Nevertheless, the test size 
could be shortened or lengthened to meet the 
needs of the individual clinician. For a short 
screening blank the first 15 stems could be used. 
Ten of the first 15 stems in Sample A were 
among the most productive 15 in Sample B. 
The test could also be used flexibly as a time- 
limit rather than a work-limit test. The patient 
could start at the beginning and work until it 
is convenient for him to stop. 

In this empirical investigation there arises 
the question of what qualities make a stem pro- 
ductive or nonproductive. To pursue this ques- 
tion, the first ten stems may be compared with 
the last ten stems in the above list. It appears 
from examining these extremes of productive 
and nonproductive stems that no single factor
differentiates the two groups. Many of the
good stems refer to the first person (I, me, 
my), whereas many of the poor stems have 
third person or impersonal references. Also, 
the poor stems often refer to activities with 
little “emotional” involvement. The good 
stems, on the other hand, are often built around 
hostilities, worries, troubles, and fears. Finally, 
the poor stems refer more often to past
situations, whereas the productive stems deal more
with the present and the future. 

It is interesting to note that the SCT does 
not appear to be a fertile technique for investi- 
gating past and childhood experiences. Not on- 
ly are past tense references prevalent among 
the poor stems listed above, but stems such as 
“Back home” and “When I was a child” are 
low in the list. Whether it is because of the 
patients’ ability to respond, the interests of the 
clinician in using the test, or the inherent na- 
ture of the stems, it seems that the SCT in the 
hospital situation is more fertile in sampling 
statements of present attitudes and future 
goals. 

A next step in the development of this SCT 
would be a validation of clinical hypotheses 
derived from it. Such a study would yield a 
final SCT that is both productive and of 
known validity. 


Summary and Conclusions 


An incomplete sentences test of 65 stems 
was given to 60 newly admitted VA neuropsy- 
chiatric hospital patients. The order of the 
stems and two different sets of instructions were 
alternated from subject to subject. Thirty- 
nine clinical psychologists in various VA hos- 
pitals made personality inferences from the pro- 
tocols and listed the completed sentences used 
in making the inferences. 

The major findings were: 

1. A correlation showed the individual dif- 
ferences in stem productivity to be significantly 
consistent from the first sample of 30 subjects 
to the second sample of 30 subjects.

2. A test of 45 highly productive stems was
devised as appropriate for psychological evalu- 
ation in neuropsychiatric hospitals. 

3. Results suggested that many of the more 
productive stems referred to the first person, 
the present and future, and “emotional” as- 
pects of the subject. 

4. No difference was found between speed 
and “real feeling” instructions with respect to 
test productivity. 


Received April 1, 1954. 
In clinical psychology, diagnostic tests are 
the most common source of personality descrip- 
tions. The descriptions range from short para- 
graphs, emphasizing a single salient charac- 
teristic of the subject, to elaborate many-paged 
analyses of several aspects of his psychological 
functioning. Very rarely are the conclusions 
drawn by the clinician restricted simply to a 
diagnosis. The protocol, whether in the form of 
the set of scores or a profile from a personality 
inventory or the verbatim responses to project- 
ive materials, leads the interpreter to make a 
large number of inferences, only one of which 
is a diagnosis qua diagnosis. These inferences 
seldom have a direct overt relationship to ob- 
jective characteristics of the protocol; they are 
a function of the interpreter’s experience, skill, 
personality, etc., as well as of the test itself. 

This situation is taken somewhat for granted 
with projective tests but is frequently ignored 
in considering the more objective personality 
measures such as the MMPI. The pristine 
beauty of the quantitative scores on this inven- 
tory has generally led investigators into testing 
a single inference, i.e., diagnosis, when judging 
the test’s validity. Yet in practice the MMPI 
is used quite differently. The clinician inspects 
the profile, occasionally scores additional scales, 
perhaps examines the actual responses to in- 
dividual items, and somehow during the process 
arrives at a number of conclusions which he 
embodies in a formal or informal psychological 
report. The validity, in clinical practice, of the 

1Presented with the approval of the Chief Medic- 
al Director of the Veterans Administration. The 
statements and conclusions published by the au- 
thors are a result of their own study and do not 
instrument would therefore seem best reflected 
by some measure of the number of correct in- 
ferences arrived at by the clinician. The pres- 
ent paper is the report of a study of the valid- 
ity, as just defined, of the MMPI. 

It is obvious that an index of validity as 
described above will vary with the clinician 
who does the interpreting and that no unique 
figure will be obtained. The authors have sug- 
gested in a previous paper [1] that the central 
tendency of such indices among competent 
clinicians interpreting a variety of subject pro- 
tocols might be considered the validity of the 
instrument itself. In this study, the results for 
a number of interpreters working with a single 
protocol are presented. They indicate only what 
can be achieved and not necessarily what 
might occur with other protocols. In addition 
the results were examined to determine the 
types of inferences made on the basis of the 
MMPI and the relationship between the resul- 
tant types and their accuracy. 


Procedure 


The general procedure has been described in 
more detail elsewhere [1] and is presented 
here in synoptic form. 


Eleven psychologists, competent in the use and 
interpretation of the MMPI and accustomed to writ- 
ing psychological reports from it, were presented 
with the MMPI profile of a subject identified to 
them as male, age 25, and single.2 On the basis of 
the information conveyed to them by this profile, 
each made a Q sort [3] of 150 items of personality 
description. The continuum was from “Most True” 
to “Most False” for the subject. The items were 
a stratified sample from a total item population of 
1604 items abstracted from 17 psychological reports 
written about the subject on the basis of his TAT 
and MAPS test protocols. The 1600 items covered, 


2The test materials and case history data of the 
subject are presented in Thematic Test Analysis [2]. 


so far as could be determined, the usual aspects of 
psychological functioning presented in psychological 
reports. 


The intercorrelations of the distribution of 
items for the 11 MMPI judges are presented 
in Table 1. 

Table 1 

Correlations among the Q Sorts of 11 MMPI Interpreters* 

Interpreter    2    3    4    5    6    7    8    9   10   11
1             63   62   50   65   73   56   68   64   56   67
2                  68   56   62   66   51   68   72   61   55
3                       62   56   50   55   64   68   51   49
4                            48   45   52   60   66   47   49
5                                 68   42   72   51   67   62
6                                      54   63   61   61   67
7                                           50   64   43   48
8                                                60   66   56
9                                                     52   55
10                                                         58

* Decimal points omitted. 


The criterion consisted of the consensus of 
29 experienced clinical psychologists and psy- 
chiatrists who Q sorted the same 150 items, 
except that their sorting was made on the basis 
of a complete clinical folder about the subject. 
The folder contained medical examination 
reports, laboratory data, course of treatment 
notes, psychotherapy notes, social history, con- 
sultation reports, etc., but excluded the psy- 
chological test reports. 


A factor analysis of the intercorrelations of 
the 29 criterion judges [1] indicated that a 
single general factor would account for 90% of 
the communality among them. Accordingly, a 
new Q sort was constructed for the criterion 
itself, as follows: 


1. A multiple-regression equation for estimating 
the criterion general factor was derived using the 
criterion judges’ general factor loadings as corre- 
lations. It was found that with only three vari- 
ables (judges) a correlation of .925 with the general 
factor was secured. 

2. The Beta coefficients for these three criterion 
judges were used to weight each of their Q-sort 
scores for the 150 items. The three weighted scores 
for each item were summed to form a new composite 
score for that item. 

3. The 150 items were then ordered into the cri- 
terion Q sort on the basis of their rank position in 
the distribution of composite scores. 
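
The three steps above, together with the correlation of each test judge against the resulting criterion, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' computation: the beta weights, the six-item sorts, the pile sizes, and the function names are all hypothetical and are introduced only for the example.

```python
from math import sqrt


def composite_scores(judge_sorts, betas):
    """Weight each judge's item scores by that judge's beta and sum per item."""
    n_items = len(judge_sorts[0])
    return [
        sum(beta * sort[i] for beta, sort in zip(betas, judge_sorts))
        for i in range(n_items)
    ]


def to_q_sort(scores, pile_sizes):
    """Force items into piles by rank; the lowest composite scores fill pile 0."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    piles = [0] * len(scores)
    start = 0
    for pile, size in enumerate(pile_sizes):
        for item in order[start:start + size]:
            piles[item] = pile
        start += size
    return piles


def pearson(x, y):
    """Product-moment correlation between two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: three criterion judges sorting six items into piles 0-3.
criterion_judges = [
    [0, 1, 1, 2, 2, 3],
    [1, 0, 2, 1, 3, 2],
    [0, 1, 2, 1, 2, 3],
]
betas = [0.5, 0.3, 0.4]    # hypothetical beta coefficients
pile_sizes = [1, 2, 2, 1]  # hypothetical forced distribution

criterion_sort = to_q_sort(composite_scores(criterion_judges, betas), pile_sizes)

# Validity coefficient of one hypothetical test judge against the criterion sort.
test_judge = [0, 1, 2, 2, 1, 3]
validity = pearson(test_judge, criterion_sort)
```

Ranking the composite scores back into a forced distribution keeps the criterion on the same scale as the individual judges' sorts, so the validity coefficient is an ordinary correlation between two Q sorts.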


To determine the validity coefficients, the 
11 MMPI judges were correlated with this 
composite criterion Q sort. 


Results 


A cluster analysis [4] of the matrix of inter- 
correlations of the MMPI interpreters (Table 
1) yielded three groups of judges. Table 2 
gives the estimated correlations of each judge 
with each of the three cluster domains and also 
their communalities. (The correlation of a 
judge with the domain of which he is a mem- 
ber is analogous to a factor loading in a factor 
analysis.) Table 3 gives the correlations among 
the cluster domains themselves. 


Table 2 

Estimated Correlations of Each MMPI Interpreter 
with the Cluster Domains and the Validity 
Coefficient for Each Interpreter 

                      Cluster domain
Cluster   Judge      A        B        C        h²    Validity
A           1     [illegible]                            .72
            6       .84*     .77      .64      .71       .74
           11       .81*     .71      .63      .66       .70
B           5       .78      .84*     .64      .71       .71
            8       .75      .83*     .76      .69       .73
           10       .70      .81*     .62      .66       .58
C           3       .65      .69      .80*     .64       .70
            4       .58      .62      .79*     .62       .56
            9   [illegible]  .66      .83*     .69       .66
            2       .74      .77      .81*     .66       .70
            7       .63      .54      .71*     .50       .52

Note.—Figures marked with an asterisk are the correlations of the 
judges with the domain of which they are a member. 














Table 3 

Estimated Correlations among the Cluster Domains 

Cluster
domain       A       B       C
A          1.00     .90     .80
B           .90    1.00     .81
C           .80     .81    1.00





To test the hypothesis that the three clusters 
were sufficient to account for all the common 
variance among the judges, a table of theoret- 
ical correlations was computed on the basis of 
each judge’s correlation with the cluster of 
which he was a member and the correlations 
among the domains (Tables 2 and 3). A com- 
parison with the 55 empirical correlations 
of Table 1 indicated that 44 (76%) of the 
theoretical correlations were within one stand- 
ard deviation of the corresponding empirical 
ones, 54 (98%) were within two standard 
deviations, and only one theoretical correlation 
deviated from its empirical counterpart by more 
than two standard deviations. The mean abso- 
lute discrepancy between the two sets of cor- 
relations was .024. It appears, therefore, that 
the three clusters do adequately account for the 
common variance in the matrix. 
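
One plausible reading of this check can be sketched as follows. It assumes that the theoretical correlation between two judges is the product of each judge's correlation with his own domain and the correlation between the two domains; the text does not state the formula, so this product rule is an assumption, and the loadings and the observed value below are hypothetical.

```python
def theoretical_r(loading_i, loading_j, domain_r):
    """Reproduced correlation between two judges under the assumed product rule."""
    return loading_i * loading_j * domain_r


def mean_abs_discrepancy(theoretical, empirical):
    """Average absolute difference between paired correlations."""
    pairs = list(zip(theoretical, empirical))
    return sum(abs(t - e) for t, e in pairs) / len(pairs)


# Hypothetical judges: one in domain A (loading .80) and one in domain B
# (loading .70), with the two domains correlating .90.
r_hat = theoretical_r(0.80, 0.70, 0.90)

# Hypothetical comparison against an observed correlation of .52.
gap = mean_abs_discrepancy([r_hat], [0.52])
```

For two judges in the same cluster the domain correlation is 1.00, so the theoretical value reduces to the product of their two loadings.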

To identify the groups of judges, an average 
Q sort was made for each of the clusters. The 
20 Most True items in each average Q sort 
were examined, and those items that appeared 
for one and only one cluster were isolated. The 
same process was also repeated for the 20 Most 
False items of the three clusters. 


The Most True items for cluster A were: 


He is in the early stages of paranoid schizophrenia. 

He has strong latent homosexual feelings. 

The psychological threat to him is really of a 
homosexual nature. 

He has an enormous amount of hostility. 

He would be threatened by interpretations given 
early in psychotherapy. 

His principal conflict is in the sexual area. 

He has at least bright normal intelligence. 


The Most False items for cluster A were: 


He does not have much anxiety. 

He should be able to progress satisfactorily in 
psychotherapy without excessive support. 

Masturbation creates no psychological problems 
for him. 

He seems little concerned with his bodily well-be- 
ing. 

His ideation is not paranoid. 


The Most True items for cluster B were: 


He appears to be a solitary person. 

An acute break with reality has probably occurred. 

The total picture is consistent with a schizophren- 
ic disorder with potential paranoid and hebephren- 
ic coloring. 

He avoids social disapproval by engaging in soli- 
tary activity. 

He has never learned to solve his problems by 
other than avoidant means. 

He has never adequately learned social skills. 


The Most False items for cluster B were: 

He appears to be an outgoing person. 

For him, withdrawal is a relatively unimportant 
mode of reacting to frustration. 

He has little guilt feelings concerning his aggres- 
sive impulses. 

He has the well-preserved mind of a person who 
is not psychotic. 

He is probably not disoriented. 


The Most True items for cluster C were: 


He perceives the world as consistently unloving. 

He has marked guilt feelings. 

His guilt feelings may overwhelm him. 

He feels deprived of the oral gratifications of 
childhood. 

He suffers from feelings of rejection. 

In fantasy he longs for kind parents. 


The Most False items for cluster C were: 


He feels generally that he is master of his own 
fate. 

He has little orality. 

His superego is relatively mature. 


This technique of identification emphasizes 
differences among the clusters; the actual high 
degree of agreement is apparent from the cor- 
relations among the cluster domains (Table 3). 
However, inspection of the cluster-identifying 
items suggests that the judges of cluster A had a 
slight preference for emphasizing the nature of 
the conflicts (over sexual impulses) of the sub- 
ject; the judges of cluster B had a preference 
for emphasizing the primary defenses (with- 
drawal) of the subject; and the judges of clus- 
ter C had a preference for emphasizing the 
affective reactions (feelings of rejection) of the 
subject. All are describing a person who is es- 
sentially schizophrenic. 


The right-hand column of Table 2 gives the 
validity figure for each judge in the form of a 
correlation with the composite Q sort described 
previously. The mean value of this validity 
figure for cluster A was .72, for cluster B, .67, 
and for cluster C, .63, with an over-all value 
for the 11 judges of .67. An analysis of variance 
indicated that the differences among the three 
clusters were not significant. 
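
The cluster means quoted above can be checked directly from the validity column of Table 2. The short sketch below simply averages the eleven coefficients by cluster; the variable names are introduced here for illustration.

```python
from statistics import mean

# Validity coefficients from the right-hand column of Table 2, grouped by cluster.
validity = {
    "A": [0.72, 0.74, 0.70],
    "B": [0.71, 0.73, 0.58],
    "C": [0.70, 0.56, 0.66, 0.70, 0.52],
}

# Mean validity per cluster, rounded to two places as in the text.
cluster_means = {name: round(mean(vals), 2) for name, vals in validity.items()}

# Mean validity over all 11 judges.
all_values = [v for vals in validity.values() for v in vals]
overall = round(mean(all_values), 2)
```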


Discussion 


The data presented above indicated that sub- 
stantial agreement may occur between descrip- 
tions of an individual based upon his MMPI 
profile and those based upon an elaborate clin- 
ical history. The degree of agreement can be 
emphasized by pointing out that the average 
general factor loading (or correlation with the 
criterion) of the criterion judges themselves was 
only .74 [1] as compared to the average of .67 
for the test judges with the composite criterion 
Q sort. Moreover, the former figure is a corre- 
lation with a criterion of unit reliability where- 
as the latter is with an estimate of that criterion 
only. Correction of the average validity figure 
of the MMPI judges for the unreliability of 
the “true” criterion estimate would further 
decrease the difference between the two figures 
cited. 

Certain cautions need to be kept in mind in 
evaluating these data, however. The results in- 
dicate only what can be done by competent 
clinicians with a specific protocol and not what 
might occur with different interpreters or with 
the same interpreters and different protocols. 
The subject used in this study presented an 
ambiguous clinical picture at the time of testing 
but his MMPI profile appears to be anything 
but equivocal (see Table 4). A strong possibili- 
ty exists, therefore, that the comparison of the 
MMPI Q sorts with the criterion Q sorts test- 
ed the agreement of two sets of statements about 
a certain nosological classification rather than 
inferences about a specific person.3 One critic 
was so unkind as to suggest that the data 
merely demonstrate the existence of a common 
delusional system among test and criterion 
judges. However, insofar as the statements 
made have a descriptive utility, agreement be- 
tween test interpretations and those made on 
the basis of exhaustive clinical material seems a 
desirable form of validity no matter how the 
statements are derived. 


Table 4 

MMPI Scores of the Subject 

Scale     Raw score      Standard score
?         [illegible]    [illegible]
L              2              43
F             13              73
K             10              46
Hs            22              90
D             36              95
Hy            35              85
Pd            27              79
Mf            36              80
Pa            21              88
Pt        [illegible]        111
Sc        [illegible]        120
Ma            25              75


3A further study, using a variety of subjects, psy- 
chological techniques, and interpreters to increase 
the representativeness of the results and to test this 
hypothesis, is currently being conducted. 





Kenneth B. Little and Edwin S. Shneidman 


Three clusters of judges could be found 
among the 11 judges used in this study, but the 
differences among them do not seem to be re- 
lated to their over-all validity. This would 
follow logically from the discussion set forth 
above. The variations in description represent 
different emphases upon certain aspects of the 
same disorder rather than disagreement as to 
diagnosis. Thus the average validity of the 
three clusters is about the same. 

A final conclusion from the study is that the 
Q technique has considerable value in the study 
of the reliability and validity of clinical tech- 
niques. It permits the quantification of results 
without loss of the idiographic approach, a 
characteristic ideally suited to clinical research. 


Summary 

The validity of inferences made from the 
MMPI was tested using an interpreter popu- 
lation of 11 experts working with the MMPI 
profile of one subject. The criterion was the 
consensus, in the form of a general factor, of 
29 clinicians who Q sorted 150 items on the 
basis of a comprehensive clinical record of the 
same subject. A cluster analysis of the matrix 
of intercorrelations among the MMPI inter- 
preters yielded three groups of judges. These 
groups were described in terms of the Most 
True and Most False Q-sort items peculiar 
to each cluster. Validity figures for each MMPI 
judge were computed as correlations with a 
weighted composite Q sort of the criterion 
judges. The general results indicated that 
MMPI interpreters can achieve a level of con- 
sensual validity on the basis of the MMPI pro- 
file approximating the average general factor 
loading of the criterion judges. Certain cau- 
tions in interpreting the results were presented, 
especially in the light of the rather unequivocal 
nature of the MMPI profile used in this case. 


Received April 26, 1954. 
The manifest anxiety scale (A scale) was 
developed by Taylor [16] as a part of an in- 
vestigation of the relation between anxiety and 
eyelid conditioning, in order to discriminate ex- 
perimental subjects on the manifest anxiety 
continuum. Taylor selected approximately 200 
items from the Minnesota Multiphasic Person- 
ality Inventory (MMPI) and submitted these 
to clinical judges with the request that they 
select those items which they judged to be in- 
dicative of manifest anxiety according to a 
definition furnished them. The 65 items on 
which there was 80% agreement or better 
were included in the scale, although in a later 
revision the scale was shortened to 50 scored 
items [14]. For a more complete description 
of the development and revision of this scale 
the reader is referred to Taylor [17]. The A 
scale has frequently been employed in investi- 
gations of anxiety and learning phenomena [1, 
4, 6, 8, 10, 11, 12, 13, 15, 18], and important 
conclusions have been derived from these stud- 
ies. Thus, a serious problem would seem to be 
the determination of the adequacy of the A 
scale as a measure of manifest anxiety. 


The reliability of the A scale has been shown 
to vary between .81 and .96 [6, 14, 16, 17], ac- 
cording to the method employed, and thus it is 
safe to conclude that adequate reliability has 
been demonstrated. The outstanding deficiency 
in the research with the A scale to date is the 


1From the Veterans Administration. 


*This paper is based on a thesis submitted in par- 
tial fulfillment of the requirements for the degree 
of master of science. The writer wishes to acknow- 
ledge his indebtedness to Dr. A. W. Bendig, who 
originally suggested the area of investigation, and to 
the other members of his committee, Drs. A. D. 
Lazovik and H. W. Goodman. A special note of 
gratitude is also due Dr. Ralph Simon, Chief Psy- 
chologist at the VA Hospital, Butler, Pennsylvania, 
the installation at which the study was carried out. 


paucity of evidence concerning its validity. As 
originally developed and used by Taylor the 
scale was not validated against any criterion of 
manifest anxiety external to the test itself. 
Taylor [17] has recently taken the position 
that the items of the scale may be regarded as 
an operational definition of manifest anxiety. 

However, if anxiety is regarded as a state 
manifesting itself in a wide variety of behavior, 
and if the A scale is conceived of as a means of 
tapping such a state, then external validation 
is clearly required. The need for external vali- 
dation is even more apparent when some of the 
assumptions of self-report tests are considered. 
Cronbach has pointed out that “the crucial 
assumptions involved are that the subject is 
willing to tell the truth to himself and to the 
investigator, and that the subject can deter- 
mine the truth” [3, p. 307]. 

The currently available studies with the 
A scale provide conflicting evidence as to its 
validity. Rosenbaum [12], in the process of 
studying anxiety and stimulus generalization, 
found that a division of his subjects into high- 
and low-anxiety groups by means of the A scale 
and psychiatric ratings gave similar results. 
These apparently positive findings as to the 
scale’s validity were placed in doubt by the re- 
sults of Bitterman and Holtzman [1]. They 
found that a division of subjects by means of 
extensive clinical evaluation demonstrated a 
significant relation between anxiety and condi- 
tioning, whereas a division by means of the A 
scale alone did not produce significant findings. 
In a more recent study Holtzman, Calvin, and 
Bitterman [7] obtained A-scale and Winne- 
scale scores for a group of subjects. A correla- 
tion of .72 was obtained between the scales, 
which the authors interpreted as evidence for 
the validity of the A scale since the Winne 
scale is an empirically derived scale of neurot- 
icism. Taylor [17] has recently presented some 
indirect evidence of the scale’s validity. She 
obtained the distribution of scores for 103 neu- 
rotic and psychotic subjects, and found that the 
median score was equivalent to the 98.9 per- 
centile for normal subjects. On the assumption 
that the former exhibit greater manifest anxiety 
than normals, she concluded that her findings 
seemed to indicate some relation between A- 
scale scores and clinical observations of man- 
ifest anxiety. However, in the same article 
Taylor indicated the desirability of direct vali- 
dation using ratings as a criterion. 

The evidence cited as to the validity of the A 
scale is in several cases contradictory, and there 
is a notable lack of formal validation studies. 
The present study was intended as a formal 
validation study using a criterion of anxiety 
external to the test itself. The particular cri- 
terion chosen was the rating by ward nurses 
of the manifestly anxious behavior of chron- 
ically ill tuberculous patients. 


Procedure 

Development of the rating scale. A seven- 
point graphic rating scale was constructed for 
each of nine aspects of manifest anxiety. The 
subtraits were drawn from Cameron’s descrip- 
tion of manifest anxiety [2, p. 249], since Tay- 
lor’s judges had used this definition in selecting 
the items of the A scale. The subtraits were 
intended to tap the extent to which the pa- 
tient: (a) gave exaggerated and inappropriate 
reactions on slight provocation, (b) gave gen- 
eral indications of fatigue not attributable to 
his physical condition, (c) displayed difficulties 
in elimination not explainable by his physical 
condition, (d) appeared to be easily upset, (e) 
showed indications of general restlessness, (f) 
slept poorly, (g) displayed symptoms of nausea 
or vomiting not attributable to his physical 
state, (h) complained of difficulties in concen- 
tration or thinking, and (i) appeared to be 
generally tremulous. Verbal descriptions of the 
mid-points and end points of the scales were 
provided, and an example drawn from the 
hospital situation was provided for each of the 
subtraits. 

A trial run of the rating scale was accom- 
plished to ascertain its reliability and to provide 
information for its further refinement if this 
proved necessary. For this purpose, 20 random- 
ly selected patients, 10 from each of two treat- 
ment wards, were independently rated by two 
nurses on each of the wards. Product-moment 
coefficients of the interjudge reliability were 
.45 and .92. Since the combined ratings by two 
judges were to serve as the criterion, the Spear- 
man-Brown formula was applied and corrected 
values of .62 and .96 obtained. A t test of the 
difference between the two r’s was 2.185, signif- 
icant at the .01 level. In view of this difference 
in reliability, an analysis was made of the rat- 
ings for the two wards, and it became apparent 
that the lowered reliability on the one ward 
could be chiefly attributed to gross disagree- 
ment in the rating of one subject. Discussion 
with the nurses revealed that one of them had 
strong negative feelings toward this particular 
patient, and therefore it was decided to empha- 
size the need for objectivity in instructing later 
groups of raters. 
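
The Spearman-Brown step in the trial run can be reproduced in a line or two. The sketch below applies the standard formula for the reliability of k combined ratings to the two interjudge coefficients, .45 and .92; the function name is introduced here for illustration.

```python
def spearman_brown(r, k=2):
    """Estimated reliability of the combined ratings of k judges,
    given the correlation r between single judges."""
    return k * r / (1 + (k - 1) * r)


# Interjudge coefficients for the two trial wards.
corrected = [round(spearman_brown(r), 2) for r in (0.45, 0.92)]
```

With two judges the formula reduces to 2r / (1 + r), which returns the corrected values of .62 and .96 given in the text.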

Subjects. The subjects of the study proper 
were 93 patients undergoing active treatment 
for pulmonary tuberculosis; they were random- 
ly selected from the resident population of the 
VA Hospital, Butler, Pennsylvania. Their age 
range was from 18 to 56 years with a median 
of 31. The sample included 80 whites and 13 
Negroes. The subjects had been given the 
group form of the MMPI as a part of a 
routine admission battery, and these tests were 
rescored for the items of the A scale. The 
range in obtained A-scale scores was from 1 to 
36, with a median of 13.06, a mean of 13.15, 
and a standard deviation of 6.56. The distri- 
bution was somewhat positively skewed. These 
findings closely parallel those of Taylor [16] 
with a college population; she obtained a range 
of from 1 to 36, a median of 14, and a positive- 
ly skewed distribution. 


After the A-scale score distribution had been 
obtained, the upper and lower 27% were select- 
ed in order to test the ability of the A scale to 
predict manifestly anxious behavior as meas- 
ured by ratings. Ratings were collected only 
for the 50 subjects falling in the extreme 
groups. Extreme groups were selected to pro- 
vide a coarse test of the discriminability of the 
scale, corresponding to the use of the scale ex- 
perimentally. The upper and lower 27 per 
cent maximized the chance of obtaining a sig- 
nificant difference [9, p. 301]. 
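
The selection of extreme groups can be illustrated as follows. With 93 subjects, 27 per cent rounds to 25 in each tail, which gives the 50 rated subjects mentioned above. The scores in the sketch are hypothetical, since individual scores are not reported, and the function name is introduced only for the example.

```python
def extreme_groups(scores, fraction=0.27):
    """Return (low, high): subject indices in the bottom and top fraction by score."""
    k = round(len(scores) * fraction)
    if k == 0:
        return [], []
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k], order[-k:]


# Hypothetical scale scores for 93 subjects, for illustration only.
scores = [(7 * i) % 37 for i in range(93)]
low, high = extreme_groups(scores)
```

Passing `fraction=0.13` to the same function would pick out the more extreme tails used in the supplementary comparison.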

Collection of ratings. Each subject was inde- 
pendently rated by two nurses on the ward 
where he resided, and only those nurses who 
had observed the subjects for at least one month 
were used as raters. The time interval between 
testing and rating ranged from two to six 
months, with the great majority of subjects 
being rated within four months of testing. 

In collecting the ratings the experimenter 
met with the two nurses from each of the 
wards and explained the task to them. Instruc- 
tion sheets were provided, and each nurse was 
asked to rate a patient not included in the study 
in the presence of the experimenter to deter- 
mine if she understood the procedure. The 
judges were instructed not to discuss the rat- 
ings with anyone else, and to accomplish them 
independently. They were allowed one week 
to complete the ratings. 


Results 


Reliability of ratings. Reliability coefficients 
were computed by means of the interjudge 
agreement method for each of the wards 
from which subjects were drawn. The obtained 
values together with the Spearman-Brown val- 
ues are presented in Table 1. A chi-square test, 
using the z’ transformation method [5, p. 135], 
was made to test the hypothesis that the ratings 
were drawn from a common population, and 
the obtained chi square of 6.295 with 5 df was 
not significant at the .05 level. Thus there was 
no basis for rejecting the null hypothesis. The 
average r, computed by the z transformation 
[5, p. 133], was .91. 


Table 1 

Interjudge Reliability Coefficients of Ratings 
Ward by Ward 

Ward            N      r*       r†
[illegible]     6   [illegible] [illegible]
3               7     .67      .80
4               7     .92      .96
5              12     .74      .85
6               7     .83      .91
7              11     .75      .86

* Product-moment correlation coefficients. 

† Product-moment coefficients corrected by Spearman- 
Brown formula to provide estimate of reliability of com- 
bined ratings by two judges. 
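
A common way of carrying out this kind of homogeneity test and averaging is sketched below. Each r is converted to Fisher's z, weighted by N - 3, and the weighted squared deviations from the mean z are summed to give a chi square with k - 1 degrees of freedom. Whether this matches the procedure of reference [5] in every detail is an assumption, and the coefficients and sample sizes shown are hypothetical.

```python
from math import atanh, tanh


def pooled_r_and_chi_square(rs, ns):
    """Fisher z pooling of correlations with a chi-square test of homogeneity.

    Each r is transformed to z = atanh(r) and weighted by n - 3.
    Returns (average r, chi square, degrees of freedom).
    """
    zs = [atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    chi_sq = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return tanh(z_bar), chi_sq, len(rs) - 1


# Hypothetical ward coefficients and sample sizes, for illustration only.
avg_r, chi_sq, df = pooled_r_and_chi_square([0.70, 0.85, 0.80], [10, 12, 9])
```

The chi square is then referred to a table with the stated degrees of freedom; a nonsignificant value is what permits the coefficients to be averaged as estimates of a single population value.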


Validity of the A scale. A t test of the signifi- 
cance of the difference between the mean rat- 
ings of subjects falling in the upper and 
lower 27 per cent of the A-scale distribution 
was made. Although the mean difference was 
in the expected direction, the obtained t was 
not significant at the .05 level. However, the 
value of 1.407 approached significance, and in 
view of the suggestive nature of this finding 
it was decided to compute a second test of 
significance providing a coarser test of validity. 
For this comparison the upper and lower 13 
per cent of the A-scale distribution were select- 
ed. The obtained t of 3.227 made it possible 
to reject the null hypothesis at better than the 
.01 level of confidence. 


Table 2 

A Comparison of the Mean Ratings of 
Extreme A-Scale Score Groups 

Group          Mean      SD       SE        t
Upper 27%     64.60    23.01     4.67
Lower 27%     55.56    21.46     4.49     1.407*
Upper 13%     72.75    21.00     6.33
Lower 13%     46.17    16.66     5.02     3.227**

* Not significant at the .05 level of confidence. 

** Significant at the .01 level of confidence. 


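
The two t ratios can be approximated from the means and standard errors in Table 2. The sketch below uses the ratio of the mean difference to the square root of the summed squared standard errors. This gives values close to, though not identical with, the reported 1.407 and 3.227; the text does not specify the exact formula, so the small differences presumably reflect rounding or a pooled-variance estimate.

```python
from math import sqrt


def t_from_summary(mean_hi, se_hi, mean_lo, se_lo):
    """t ratio for two independent groups from their means and standard errors."""
    return (mean_hi - mean_lo) / sqrt(se_hi ** 2 + se_lo ** 2)


# Means and standard errors taken from Table 2.
t_27 = t_from_summary(64.60, 4.67, 55.56, 4.49)  # upper vs. lower 27 per cent
t_13 = t_from_summary(72.75, 6.33, 46.17, 5.02)  # upper vs. lower 13 per cent
```
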
Discussion 


Reliability of the rating scale. It was felt 
that the reliability of the ratings, as demon- 
strated, was satisfactorily high. The average 
reliability of .91 compares favorably with that 
for the A scale itself, for which values of from 
.81 to .96 have been reported [6, 14, 16, 17]. 
Thus an incidental result of this study would 
appear to be the finding that nurses who are 
instructed in the use of graphic rating scales 
are able to rate the behavior of their patients 
reliably. Of course this conclusion must be qual- 
ified when it is recalled that the behavior here 
under investigation was readily observable and 
that the subjects were under observation for 
relatively long periods of time. 


Validity of the A scale. Although the original 
test of validity employing upper and lower 27 
per cent groups yielded negative findings, a 
supplementary test with more extreme groups 
produced positive results. This procedure was 
felt to be justified when the nature of the t 
test is considered. Although the first test did 
not make it possible to reject the null hypothesis 
at the .05 level, it did not establish that no 
real difference existed between the two groups. 
The supplementary test involved increasing the 
chances of obtaining a large mean difference 
only if there were a relationship between A- 
scale scores and the ratings, and at the same 
time entailed a loss of degrees of freedom. 

The finding of a significant difference only 
with very extreme groups may account for 
some of the differences in results in studies of 
the scale. A review of these studies shows that 
those using the preselection of extreme A-scale 
groups [13, 14, 16] have usually found a posi- 
tive relationship between A-scale scores and 
simple conditioning, whereas studies not using 
such preselection of subjects [1, 6] have gener- 
ally not demonstrated a significant relationship. 
The results of the present study are interpreted 
as supporting the validity of the A scale only as 
a coarse measure of manifest anxiety, and its 
experimental use is recommended only when it 
is desired to select extreme groups. 


Summary 


The present research was designed as an 
investigation of the validity of the A scale as a 
measure of manifest anxiety. A manifest anx- 
iety rating scale was developed to facilitate rat- 
ings which were to serve as a criterion for val- 
idation. Ratings accomplished by means of this 
scale by ward nurses were found to be satis- 
factorily reliable when reliability was deter- 
mined by the interjudge agreement method. 

The A-scale scores for a sample of 93 hospi- 
talized tuberculosis patients were obtained, and 
extreme groups were selected so as to include 
the upper and lower 27 per cent. Each subject 
was rated independently by two ward nurses, 
and their combined ratings served as the criter- 
ion for validation. A test of significance be- 
tween the mean ratings of these two groups did 
not make it possible to reject the null hypothesis. 
A supplementary comparison between the upper 
and lower 13 per cent did make it possible. 
The results are interpreted as indicating that 
the A scale is valid only as an extremely coarse 
measure of manifest anxiety. 
Recently a number of studies have been re- 
ported [1, 3, 5, 6, 7, 8, 11, 12] in which the 
subjects were selected by the use of a scale of 
manifest anxiety constructed by Taylor [9]. 
The items for this scale were selected from the 
Minnesota Multiphasic Personality Inven- 
tory (MMPI) as being indicative of manifest 
anxiety in the judgments of experts. Retest 
reliability figures for the scale range from .85 
to .89 [10] depending in part upon the form 
of the test used and the interval between tests. 
Holtzman et al. [4] have reported some valid- 
ity data in the form of a correlation with an 
independently derived scale of neuroticism by 
Winne [13]. Their validity figure is about 
.75 with some slight variation if a nonlinear 
index is used. 

It would seem, therefore, that the scale sat- 
isfactorily measures the expression of discom- 
fort, complaints about inefficient functioning, 
and general psychological malaise which can 
be classified under the heading of manifest 
anxiety. 

The manifest anxiety scale (A scale) has 
been used to date primarily for selecting cri- 
terion groups in the study of the differential ef- 
fects of anxiety on a variety of functions. It 
is obvious, however, that a scale measuring 
some form of anxiety could be of value to the 
clinician if his current tools do not permit 
such assessment. In interpreting the MMPI, 
statements about the level of anxiety of the 
subject are currently made on the basis of the 
relationship among certain of the scale scores, 
e.g., Depression (D), Psychasthenia (Pt) 
and K. If the A scale provides information 
not present in the scores of the MMPI scales 
or easily inferred from the relationships among 
them, then it would seem a desirable adjunct 
to the standard profile. The present investigation 
was undertaken to provide information 
on this point through the study of the relationship 
of the A scale to the existing scales of 
the MMPI. 


1From the VA Hospital, Palo Alto, California. 


Procedure 


The product-moment correlation between 
the A scale scored from the 50 A items in the 
MMPI and each of the MMPI scales (with 
the exception of the Lie and the Question 
scales) was computed on the basis of raw 
scores for each of two groups.2 The first 
group consisted of 106 male patients of a Vet- 
erans Administration neuropsychiatric hospital 
(sample 1). They constituted the entire pop- 
ulation of usable records in the active files of 
the psychology service of that hospital. The 
modal diagnosis of the group was paranoid 
schizophrenia. The records of 73 male college 
students formed the second or “normal” group 
(sample 2). Scatter diagrams for all correla- 
tions were inspected and judged to be essen- 
tially linear. 
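
The product-moment correlation used in this procedure can be computed from raw scores as in the sketch below. The function name `pearson_r` and the six pairs of scores are hypothetical and serve only to show the arithmetic; they are not data from either sample.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of raw scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical raw scores for six subjects on the A scale and on Pt.
a_raw = [12, 25, 7, 31, 18, 22]
pt_raw = [10, 24, 9, 33, 15, 20]
print(f"r = {pearson_r(a_raw, pt_raw):.2f}")
```

Because the coefficient describes only the linear component of a relationship, inspecting the scatter diagrams first, as was done here, is a sensible precaution.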


Results and Discussion 


Table 1 gives the correlations between the 
A scale and each of the MMPI scales for the 
two groups. Also indicated are the means and 
standard deviations of the scale score distri- 
butions. 


The most striking feature of the data in 
Table 1 is the extremely high correlation between 
the Taylor anxiety scale and the 
MMPI Pt scale. In the hospital sample the 


2We would like to express our appreciation to R. 
E. Billings and Yvonne Brackbill for assistance in 
collecting data and making the calculations. 
Table 1 


Correlations of the Taylor Manifest Anxiety Scale 
with the Scales of the MMPI for 106 Neuropsychiatric 
Hospital Patients and 73 College Students 


              Patients                  Students 
              Raw score                 Raw score 
Scale    rA      Mean    SD        rA      Mean    SD 
A        -       19.4    11.7      -       11.8    6.1 
F        .63     10.6    7.7       .19     4.2     2.4 
K        -.74    14.1    6.0       -.56    16.0    4.1 
Hs       .78     10.5    8.2       .60     4.0     2.8 
D        .74     26.0    8.5       .38     19.0    4.5 
Hy       .50     24.4    7.5       .05     20.3    4.4 
Pd       .64     21.4    6.0       .16     15.2    4.2 
Mf       .45     25.6    5.5       .44     27.1    4.5 
Pa       .59     13.6    5.3       .15     8.5     2.6 
Pt       .92     18.9    11.9      .81     10.2    5.3 
Sc       .83     22.1    14.8      .56     9.3     5.2 
Ma       .28     18.0    5.3       .10     16.5    4.5 



obtained correlation (.92) exceeds the highest 
reliabilities reported for either of the two 
scales, suggesting that they would make very 
satisfactory alternate forms. The correlation 
between A and Pt in the college sample (.81) 
is significantly lower than in the hospital 
group. It is possible, therefore, that among 
college students A and Pt are measuring something 
slightly different. 
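
One conventional way to check that two independent correlations differ is Fisher's r-to-z transformation, sketched below with the values reported above (.92 for 106 patients, .81 for 73 students). Which test the authors actually applied is not stated, so the choice of test is an assumption, and the function name `compare_correlations` is introduced here for illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist


def compare_correlations(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations."""
    z1, z2 = atanh(r1), atanh(r2)            # r-to-z transformation
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # standard error of z1 - z2
    z = (z1 - z2) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


z, p = compare_correlations(0.92, 106, 0.81, 73)
print(f"z = {z:.2f}, two-tailed p = {p:.4f}")
```

Under these assumptions the difference is reliable at the .01 level, which agrees with the statement in the text.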


In order to determine whether the size of 
the correlation between Pt and A was the result 
of chance factors, the relationship between 
the two scales in samples from five additional 
clinical populations was computed.3 These 
groups were: (sample 3) 60 consecutive ad- 
missions to a general medical hospital; (sam- 
ple 4) 63 cases referred to the psychology sec- 
tion of a Veterans Administration diagnostic 
center; (sample 5) 60 randomly selected cases 
from the files of a university counseling cen- 
ter; (sample 6) 83 consecutive admissions to 
a Veterans Administration tubercular hos- 
pital; (sample 7) 59 additional neuropsychi- 
atric hospital cases. The data thus obtained 
are presented in Table 2. 


In each of the samples the correlation between 
A and Pt again exceeds the reported reliabilities 
of the two scales. The explanation 


3The authors are indebted to Dr. Jerome Fisher, 
VA Hospital, San Francisco; Dr. Roger Bardsley, 
VA Hospital, Livermore; and Dr. John Black, 
Counseling Center, Stanford University, for their 
assistance in compiling these data. 
can be advanced that the excessively high correlation 
is the result of correlated error terms 
in the two scales. They are taken simultane- 
ously so momentary variations in mood and at- 
titude affect each equally; both are imbedded 
in the same common matrix of the remainder 
of MMPI items inducing a similar set; and, 
for the 13 items the scales have in common, 
the identical responses are scored on each. 
Such factors will, of course, have the effect of 
increasing the apparent relationship between 
A and Pt. In terms of the expressed purpose 
of the study, however, the results are quite 
clear: with clinical populations no information 
would be provided by the scoring of the 
A scale in the MMPI that is not already present 
in the current battery of scales. It is possible 
that the A scale as administered in its 
own set of buffer items might provide useful 
additional information for the clinician, since 
Taylor [10] found a correlation of only .68 
between the two forms administered with an 
18-week time interval. However, Deese et al. 
[2] report a correlation of .81 between the 
Biographical Inventory form of the A scale 
and the MMPI Pt scale in a college population 
when the latter was taken at the same 
time as the former. In a more variable clinical 
population it is reasonable to assume that the 
figure would be even higher, perhaps approximating 
those obtained in the present study. 


An additional finding of interest in Tables 
1 and 2 is the remarkably low average A 
scores of the clinical samples. Taylor reports 
a median score of 34 for 103 neuropsychiatric 
subjects and, although the mean score is not 
given, it can be estimated as about 31 from the 
frequency polygon she presents [10, p. 289]. 
In another report by Spence and Taylor [8] 
the mean A score for 55 neurotics and psychotics 
is given as “about 32.” In the present 
study the highest average A score of any sample 
is 21.9 with a standard error of the mean 
of 1.5. The difference between this figure and 
those obtained by Taylor, and Spence and 
Taylor, is obviously quite significant, but no 
ready explanation is apparent for the discrepancy. 


The comparison of the average MMPI pro- 
files for the high and low scores in both groups 
gives some suggestion as to the meaning of the 





MMPI Correlates of the Taylor Anxiety Scale 




Table 2 


Correlations Between the Taylor Manifest Anxiety Scale and the MMPI Pt Scale 
in Five Samples of Clinical Populations 


Sample  Population                   N    rAPt   MA     SDA    MPt    SDPt 
3       General medical patients     60   .92    16.1   8.7    11.1   8.1 
4       Diagnostic center patients   63   .91    16.1   9.2    12.4   9.6 
5       Counseling center clients    60   .93    12.3   7.7    11.4   7.2 
6       TB hospital patients         83   .91    14.6   8.7    13.1   8.2 
7       NP hospital patients         59   .92    21.9   11.¢   18.5   11.6 





A scores in different populations. Table 3 presents 
the mean raw scores for the highest and 
lowest 10 subjects of the A-score distributions 
of the college and neuropsychiatric hospital 
groups.4 


Table 3 


Average MMPI Scale Scores of the Ten Highest 
and Ten Lowest A-Scale Cases of the Student and 
Hospital Groups 


Group             L F K Hs D Hy Pd Mf Pa Pt Sc Ma A 
College Low A     4 3 21 216 2114228 83 4 4 3 
College High A    4 612 7 238 21 16 29 8 17 138 15 22 
Hospital Low A    6 2 21 9 19 20 17 22 9 . 2m = 
Hospital High A   3 17 10 21 36 831 26 28 17 33 38 21 39 





For the high scorers in the hospital, the pro- 
file is an obviously psychotic one although the 
relation of F and K suggests that there is some 
tendency for the subjects to place themselves 
in the worst possible light. Overt anxiety and 
turmoil are present as suggested by the high 
D score and the profile in general suggests 
that the patients recognize their illness and are 
actively seeking help. 

The mean profile of the low A scorers of 
the hospital group, on the other hand, suggests 
more of the character disorder. The existence 
of denied anxiety is implied by the elevated K 
score, but the acting out of impulses indicated 
by the peak on Pd reduces much of the ten- 
sion that might produce manifest anxiety 
symptoms. 

In the college group, the higher scorers’ average 
profile suggests that they are introspective, 
quite sensitive to environmental press, 
and willing to admit to being easily disturbed. 
The low scorers, conversely, utilize both denial 
and repression fairly frequently and rarely introspect. 


4To conserve space, T-score profiles are not presented. 
The interested reader may plot them on the 
MMPI scoring sheet using the data of 
Table 3. 

Summary 

The relationship between the Taylor Manifest 
Anxiety Scale, as scored in the MMPI, 
and the other MMPI scales was examined in 
an attempt to discover what additional information 
would be provided by its use. Two 
samples, a college student group and a neuropsychiatric 
hospital group, were used. On the 
basis of the extremely high correlation between 
the A scale and the MMPI Pt scale (.92), 
it was concluded that manifest anxiety could 
be ascertained from the latter in clinical populations 
as reliably as from the former. The 
meaning of high and low scores on the A scale 
of the two groups was discussed in terms of 
mean MMPI profiles of extreme groups. 


Regional Differences in MMPI Responses 
Among Male College Students1 


Leonard D. Goodstein2 


State University of Iowa 


The usefulness of any psychometric instru- 
ment depends, at least in part, upon the appli- 
cability of the available norms. The blind appli- 
cation of a test valid for one population to a 
different type of group often leads to serious 
errors of interpretation. The use of the MMPI 
with college populations has been criticized on 
two points. First, college students tend to be 
more deviant than the general population in 
their MMPI responses and, second, local or 
regional norms are thought necessary for each 
college or geographical region. Brown [2], 
Cottle [4], Gilliland and Colgin [7], and 
others have urged the development of local 
norms while Tyler and Michaelis3 among 
others have actually constructed such local 
norms. 

Black [1] has taken a somewhat contradic- 
tory position. He collected, from the literature, 
MMPI data on a large group of college fe- 
males, a total N of 5014, at 15 different col- 
leges and universities. Although he did not use 
an over-all test of significance, he concluded, 
“these data suggest that there is a characteristic 
profile for college women which does not differ 
from college to college. It is certainly true that 
some of the differences are statistically significant, 
but that they are of little practical significance” 
[1, p. 34]. The purpose of the present 
paper is further to test the notion that local or 
regional norms are necessary for interpreting 
MMPI profiles of college students by examin- 
ing the data available on college males. 


1Based on a paper presented at the annual meeting 
of the Midwestern Psychological Association, 
Columbus, Ohio, April 30, 1954. 


2The author wishes to acknowledge his indebtedness 
to Robert A. Johnston for his assistance with the 
statistical analysis of these data. 


3F. T. Tyler & J. U. Michaelis. University student 
norms for the Minnesota Multiphasic Personality 
Inventory. Unpublished manuscript, School of 
Education, University of California, Berkeley. 


Procedure 


One group of Ss, consisting of 408 randomly 
selected freshman males, had been tested at the 
State University of Iowa using the booklet 
form of the MMPI. From the published literature 
and from those unpublished studies available 
to the author, seven additional groups 
were selected as meeting the following criteria: 


a. Ss randomly selected or 100 per cent sampling 
technique used. Data that had been obtained through 
administration to students in special curricula or to 
preselected groups of students, e.g., student leaders, 
might lead to a confounding of the results and were 
not included. 

b. N>100. Relatively large samples were seemingly 
more free of bias. 

c. Means and standard deviations reported for all 
nine clinical scales with K correction where appropriate. 
As the K correction is now routinely used, 
those studies without the K correction were not included. 

Results 


The means, the standard deviations, and the 
number of cases on the nine clinical scales and 
the K scale for the eight schools are presented 
in Table 1. The agreement among the scale 
means of the eight samples is very striking. The 
largest difference between any two means on 
a single scale is 7.8 T-score units on the Mf 
scale. With the exception of the differences on 
the Mf scale, none of the differences exceeds 
4.9 T-score units, which would be less than one-half 
of a standard score according to the original 
standardization data. The smallness of the 
differences using raw scores is even more strik- 
ing. Only the Mf scale has a difference be- 
tween any two means of over 4 raw score units. 
The median range difference in raw score 
units is only 1.9. 
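
The range figures quoted above can be checked directly against Table 1. The sketch below uses three scales whose eight college means are all legible, and it assumes that the table's columns run K, Hs, D, Hy, Pd, Mf, Pa, Pt, Sc, Ma; the dictionary name `means` and the helper `spread` are introduced here for illustration.

```python
# Mean T scores transcribed from Table 1, in the table's order of colleges:
# Maine, Penn. State, Iowa, Minnesota, Wisconsin, Montana State,
# New Mexico, Utah State. The column assignment is an assumption.
means = {
    "Pd": [55.0, 57.0, 57.0, 56.6, 56.0, 56.0, 58.1, 54.3],
    "Mf": [59.0, 59.6, 58.0, 56.9, 59.0, 58.0, 62.7, 54.9],
    "Ma": [60.0, 60.2, 58.0, 58.1, 58.0, 59.0, 59.2, 55.8],
}


def spread(values):
    """Largest difference between any two means, rounded to one decimal."""
    return round(max(values) - min(values), 1)


ranges = {scale: spread(values) for scale, values in means.items()}
for scale, value in ranges.items():
    print(f"{scale}: range of college means = {value} T-score units")
```

On this reading the Mf range reproduces the 7.8 units reported in the text, while Pd and Ma stay below the 4.9-unit ceiling given for the remaining scales.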

The data from these eight colleges were 
then placed into three groups based upon geo- 
graphical regions: the Eastern group, including 
Table 1 


MMPI Means and Standard Deviations in T Scores for 
Each College Arranged According to Geographical Region 


College            N          K      Hs*    D      Hy     Pd*    Mf     Pa     Pt*    Sc*    Ma* 
East 
Maine [10]         316   M    †      53.0   53.0   55.0   55.0   59.0   50.0   56.0   55.0   60.0 
                         SD   †      8.5    12.1   [?]    10.8   10.6   9.1    12.6   12.2   11.1 
Penn. State [3]    121   M    †      [?]    [?]    56.3   57.0   59.6   52.2   58.0   57.1   60.2 
                         SD   †      8.3    11.8   7.7    9.5    8.6    9.6    10.8   8.9    9.6 
Midwest 
Iowa               408   M    15.3   53.0   53.0   56.0   57.0   58.0   54.0   56.0   57.0   58.0 
                         SD   4.8    8.2    11.6   8.2    9.8    10.8   7.9    12.8   11.6   11.0 
Minnesota [11]     1321  M    14.4   51.1   51.8   54.2   56.6   56.9   52.6   56.8   56.9   58.1 
                         SD   4.6    8.1    10.3   7.3    9.7    9.7    8.6    10.4   10.8   10.3 
Wisconsin‡         1422  M    14.5   51.2   52.2   55.0   56.0   59.0   53.0   56.7   56.0   58.0 
                         SD   4.4    7.2    10.5   7.7    9.7    10.6   7.9    9.4    10.1   10.6 
West 
Montana State‡     456   M    14.2   50.5   52.5   54.5   56.0   58.0   54.1   56.9   56.9   59.0 
                         SD   4.5    10.0   12.1   7.8    9.6    10.2   8.0    10.1   8.9    9.6 
New Mexico [9]     149   M    16.4   54.1   53.9   57.8   58.1   62.7   53.4   56.8   57.2   59.2 
                         SD   94     8.7    10.5   7.8    10.4   9.9    7.8    10.2   10.5   10.0 
Utah State [5]     842   M    13.8   51.5   50.2   52.9   54.3   54.9   53.0   53.6   52.4   55.8 
                         SD   4.4    7.1    9.9    7.9    8.8    9.1    9.0    9.5    10.7   9.2 
Median             5035  M    14.5   52.3   52.8   55.0   56.3   58.5   53.0   56.7   56.9   58.7 
                         SD   4.6    8.3    11.1   7.8    9.8    10.1   8.3    10.3   10.8   10.2 

* Includes K correction. 
† K added but its value not reported. 
‡ From an unpublished paper by Donald P. Hoyt, Student Counseling Bureau, University of Minnesota. 
The present author wishes to thank Dr. L. E. Drake and Dr. G. A. Renzaglia for permission to publish the 
Wisconsin and Montana State data. 


the University of Maine and the Pennsylvania 
State College; the Midwestern group, includ- 
ing the Universities of Iowa, Minnesota, and 
Wisconsin; and the Western group, including 
Montana State College, Utah State College, and 
the University of New Mexico. To provide an 
over-all test of significance these data were 
analyzed by a Lindquist [8] Type I analysis 
of variance design in which the regions were a 
“between Ss” factor, while the nine clinical 
scales and the regions X scales interaction were 
“within Ss” factors. As shown in Table 2, only 














Table 2 
Analysis of Variance of Means 

Source                 df     MS       F 
Between Ss             7      11.37 
  Regions              2      1.69 
  Error (b)            5      15.25 
Within Ss              64     67.55 
  Scales               8      46.71    56.72** 
  Regions X Scales     16     1.45     1.63 
  Error (w)            40     .89 
Total                  71 
** p < .01. 


the obtained F for the scales was significant, 
indicating that the means on the nine clinical 
scales were not equal. The obtained F’s for 
regions and the regions X scales interaction did 
not exceed unity. There is no evidence to sup- 
port the notion that geographical differences 
are significant determinants of MMPI means. 
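
The partition behind an analysis of this kind, with one between-subjects factor and one within-subjects factor, can be sketched as follows. This is a generic mixed-design decomposition written for illustration, not a reproduction of Lindquist's worked procedure; the function name `type1_anova` and the input values are hypothetical, with each of the eight schools treated as one "subject" measured on nine scales.

```python
import random


def type1_anova(scores, groups):
    """Mixed-design ANOVA: `groups` is between subjects, columns are within subjects."""
    n, k = len(scores), len(scores[0])
    labels = sorted(set(groups))
    g = len(labels)
    grand = sum(map(sum, scores)) / (n * k)

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_between = k * sum((sum(row) / k - grand) ** 2 for row in scores)
    ss_cond = n * sum((sum(row[j] for row in scores) / n - grand) ** 2
                      for j in range(k))

    ss_groups = ss_cells = 0.0
    for label in labels:
        block = [row for row, grp in zip(scores, groups) if grp == label]
        m = len(block)
        group_mean = sum(map(sum, block)) / (m * k)
        ss_groups += m * k * (group_mean - grand) ** 2
        ss_cells += m * sum((sum(row[j] for row in block) / m - grand) ** 2
                            for j in range(k))

    ss = {
        "groups": ss_groups,
        "error_b": ss_between - ss_groups,
        "conditions": ss_cond,
        "interaction": ss_cells - ss_groups - ss_cond,
    }
    ss["error_w"] = ss_total - ss_between - ss["conditions"] - ss["interaction"]
    df = {"groups": g - 1, "error_b": n - g, "conditions": k - 1,
          "interaction": (g - 1) * (k - 1), "error_w": (n - g) * (k - 1)}
    ms = {name: ss[name] / df[name] for name in df}
    # Groups are tested against error (b); scales and interaction against error (w).
    f = {"groups": ms["groups"] / ms["error_b"],
         "conditions": ms["conditions"] / ms["error_w"],
         "interaction": ms["interaction"] / ms["error_w"]}
    return df, ms, f


# Hypothetical input: 8 schools in 3 regions, 9 scale means per school.
rng = random.Random(1)
regions = ["East"] * 2 + ["Midwest"] * 3 + ["West"] * 3
table = [[55 + rng.gauss(0, 2) for _ in range(9)] for _ in regions]
df, ms, f = type1_anova(table, regions)
print(df)
```

With eight schools, three regions, and nine scales, the degrees of freedom come out as 2, 5, 8, 16, and 40, matching the df column of Table 2.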


Table 3 


Analysis of Variance of SD’s 


Source                 df     MS       F 
Between Ss             7      2.34 
  Regions              2      2.13 
  Error (b)            5      2.43 
Within Ss              64     1.87 
  Scales               8      10.81    18.02** 
  Regions X Scales     16     .57 
  Error (w)            40     .60 
Total                  71 
** p < .01. 


To test the importance of these geographical 
regional influences, a similar analysis was then 
performed with the standard deviation data. 
The variability data were also divided into 
three groups in the same manner as the mean 
data and the identical analysis of variance design 
was used.4 

As may be seen in Table 3, only the ob- 
tained F for scales was statistically significant, 
which would indicate that the variability of 
the nine clinical scales was not equal. The D 
and Pt scales have the largest variances while 
the Hy, Hs, and Pa scales have the smallest 
variances. The F’s for regions and the regions 
X scales interaction were less than unity. On 
the basis of this analysis, therefore, we may 
conclude that the variances are not significant- 
ly dependent upon regional influences. 

Discussion 

An inspection of the means for the eight 
schools reveals that no one group was con- 
sistently higher or lower than any other group. 
The patterning of scores would seem to be 
more a function of chance fluctuation than due 
to the operation of any cultural or environ- 
mental factors. The obtained differences over 
regions would seem to be of such little conse- 
quence that the development of regional norms 
seems unnecessary. 


It is important to note that the mean T 
scores on all nine clinical scales are above the 
expected mean value of 50. This finding is in 
support of the notion that college students, as 
a group, are more deviant in their responses to 
the MMPI than the general adult population 
used in the standardization of the instrument. 
These results, however, should not be interpreted 
to mean that the MMPI cannot be useful 
in evaluating the adjustment of college students; 
rather, they support the idea that separate 
norms for college students as a group are not 
only desirable but essential. 

4While such data do not usually fulfill the assumptions 
about normality necessary for an analysis 
of variance, recent empirical evidence would indicate 
that slight deviations from normality are of 
little consequence and may be ignored [8, pp. 78-90]. 

In this regard it is interesting to note that, 
using 70 as the cutting score, approximately 
15% of our total group of 5000, or 750, have 
“abnormal” scores on the Ma scale. Extending 
this criterion to the other scales, we identify 
approximately 7 to 10% of our group as abnormal 
on each of the Pd, Mf, Pt, and Sc 
scales. Without allowing for any overlap, that 
is, a single S scoring above 70 on more than 
one scale, we would have identified approximately 
half of our 5000 male college students 
as abnormal by such a procedure. While 
this, of course, may be a valid “diagnosis,” the 
capacity of our college mental hygiene facilities 
certainly demands a more rigorous screening 
instrument. Seemingly the usefulness of the 
MMPI as a screening test in the collegiate 
setting would depend upon the development of 
new cutting scores. 
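
A rough check on these percentages is possible from the "Median" row of Table 1 alone. The sketch below assumes that T scores on each scale are approximately normally distributed and that the table's columns run K, Hs, D, Hy, Pd, Mf, Pa, Pt, Sc, Ma. Both are assumptions made here, and the figures in the text were presumably obtained by actual counts, so only approximate agreement should be expected.

```python
from statistics import NormalDist

CUTOFF = 70  # conventional T-score cutting point

# (median mean, median SD) for the five peak scales, read from Table 1.
median_profile = {
    "Pd": (56.3, 9.8),
    "Mf": (58.5, 10.1),
    "Pt": (56.7, 10.3),
    "Sc": (56.9, 10.8),
    "Ma": (58.7, 10.2),
}


def share_above(mean, sd, cutoff=CUTOFF):
    """Proportion of a normal distribution lying above the cutting score."""
    return 1 - NormalDist(mean, sd).cdf(cutoff)


shares = {scale: share_above(m, sd) for scale, (m, sd) in median_profile.items()}
for scale, share in shares.items():
    print(f"{scale}: {share:.1%} above T = {CUTOFF}")

# Summing the shares ignores overlap, i.e. one S elevated on several scales.
print(f"sum over five scales, no overlap: {sum(shares.values()):.1%}")
```

The normal approximation puts the Ma share near the reported 15 per cent and the five-scale total near one-half, which is the order of magnitude on which the argument for new cutting scores rests.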

The consistency in the pattern of scores 
among the eight schools also seems worthy of 
note. The peaks on Ma, Mf, Sc, Pt, Pd, and 
Hy suggest that there is a characteristic pro- 
file for college males. They appear to be more 
feminine in their interests, to be more active, 
less inhibited, but more worrying than the male 
population in general. This diagnostic picture 
would seem to be validated by clinical obser- 
vations in the classroom and on the campus as 
well as in the counseling office. 

We may then compare our typical profile for 
college males with that obtained by Black [1] 
for college females. This comparison is shown 
in Figure 1. In each case the points are based 
upon the median mean; in our study the median 
is of 8 colleges, in Black’s study it is of 15 
colleges. The total N’s, however, are very similar, 
5035 for our sample, 5014 for Black’s. 

The differences between these two profiles 
are rather interesting. The males are higher 
on all the clinical scales with the exception of 
Pa where the points almost coincide. The 
largest differences are on the Mf, Ma, Pt, D, 
and Hs scales. The women are below the ex- 
pected mean value of 50 on the Hs and D 
scales while the men, as noted before, are con- 
sistently above 50. The women are higher on 
the K scale, indicating a somewhat more de- 
fensive attitude. The effect of this higher K 
value, however, is to reduce the differences be- 
tween the sexes on K-corrected scales as the 
higher K results in higher values on five of the 
nine scales. 


While it may not be obvious upon first 
glance, there is a good bit of similarity between 
these two profiles. If we exclude the Mf and 
Pa scales from consideration, the profiles then 
appear quite similar with peaks on Ma and Sc 
and valleys on Hs and D, although the female 
profile is, of course, lower. It is rather 
surprising to note that, while college males 
are considerably more feminine in their interests 
than males in general, college females 
are not more masculine in their interests than 
women in general. The reasons for this finding 
and the possible implications for the future adjustment 
of these individuals are left to conjecture. 
The differences between these profiles 
would suggest that, in using the MMPI in a 
college setting, separate norms are necessary 
for males and females. 


Fig. 1. Comparison of the MMPI profiles of male and female college students. 


Summary 


The present study has been a comparison of 
the results of MMPI administration to large 
groups of male undergraduates at eight dif- 
ferent colleges and universities for regional dif- 
ferences. When the mean scores were subjected 
to an analysis of variance, the results indicated 
that there were no significant regional differ- 
ences. An analysis of variance for the standard 
deviations also showed no significant regional 
differences. The obtained differences would 
seem to be of such little consequence that the 
development of regional or local norms seems 
unnecessary. 


While there is a characteristic profile for 
the college male that differs little from college 
to college, it is markedly different from the 
characteristic profile of the noncollege male 
and from the characteristic profile of the col- 
lege female. New norms would appear neces- 
sary in using the MMPI as a screening test in 
clinical work with university populations. 


Received May 10, 1954. 
Social Desirability as a Factor in Edwards’ 
Personality Preference Schedule Performance 


Leslie Navran and James C. Stauffacher 
VA Hospital, American Lake, Washington 


The Edwards Personality Preference Sched- 
ule (EPPS) is a new personality inventory 
which yields scores for 15 of the needs in Murray’s 
theoretical system. These are: achievement, 
deference, conjunctivity, exhibition, autonomy, 
affiliation, intraception, succorance, 
dominance, abasement, nurturance, change, en- 
durance, heterosexuality, and aggression. In 
constructing the schedule, Edwards has utilized 
the knowledge that there is a linear relation- 
ship between the probability of endorsement of 
an item and its judged social desirability (r = 
.871). Thus, the structure of the instrument is 
such that for each of the 225 items, the sub- 
ject has to choose between two statements 
which are matched for social desirability, des- 
ignating his choice as being more character- 
istic of him. 

This paper presents data from a project de- 
signed to evaluate the usefulness of the EPPS 
for counseling and guidance in the field of 
nursing. 

The EPPS was administered to 25 student 
affiliate nurses spending three months in resi- 
dence at VA Hospital, American Lake, Wash- 
ington. A month later, the girls separately 
ranked cards which bore the names and defini- 
tions of the needs sampled in the EPPS so that 


1An extended report of this study may be obtained 
without charge from Leslie Navran, VA Hospital, 
American Lake, Washington, or for a fee from 
the American Documentation Institute. To obtain it 
from the latter source, order Document No. 4395 
from ADI Auxiliary Publications Project, Photo- 
duplication Service, Library of Congress, Washing- 
ton 25, D.C., remitting in advance $1.25 for micro- 
film or $1.25 for photocopies. Make checks payable to 
Chief, Photoduplication Service, Library of Con- 
gress. 


they ranged from most to least characteristic 
of themselves. Then, they ranked the 15 needs 
again from most to least socially desirable. 

Composite ranks were made of the 15 needs 
from the EPPS raw scores, from the students’ 
self-descriptions, and finally from their judg- 
ments of social desirability. —The composite 
EPPS raw-score ranking correlated —.03 with 
the self-description composite, and —.01 with 
the social desirability composite. However, the 
self-description rankings correlated .90 with 
social desirability judgments! An analysis was 
made of the same correlations for each indi- 
vidual. While none of the 25 correlations be- 
tween EPPS scores and social desirability judg- 
ments was significantly different from zero, 15 
of the correlations between self-descriptions 
and social desirability were significantly dif- 
ferent from zero at the .05 level (p = .44 or 
more). At the .01 level, 13 correlations were 
above the required value of .59 (one-tailed 
test ).* 
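
Since each set of data consists of the 15 needs placed in rank order, the comparisons above are naturally expressed as rank-order correlations. The sketch below assumes Spearman's coefficient without tied ranks; the report does not name the coefficient, the function name `spearman_rho` is introduced here, and the two rankings are hypothetical.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank-order correlation for two untied rankings of the same items."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


CRITICAL_05 = 0.44  # one-tailed value for 15 ranked needs, as quoted in the text

# Hypothetical rankings of the 15 needs by one student.
self_description = list(range(1, 16))
social_desirability = [2, 1, 3, 5, 4, 6, 8, 7, 9, 10, 12, 11, 13, 15, 14]

rho = spearman_rho(self_description, social_desirability)
print(f"rho = {rho:.2f}, significant at .05: {rho >= CRITICAL_05}")
```

Repeating the computation for each of the 25 students and counting the coefficients at or above the critical value gives the kind of tally reported above.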

These results lend impressive support to the 
proposition that the EPPS not only has elimin- 
ated the influence of social desirability at the 
item level, but has also considerably reduced its 
operation at the variable level. Furthermore, 
it is felt that the criterion of self-description 
has been shown to be so heavily influenced by 
social desirability considerations as to be use- 
less. More objective criteria are needed to at- 
tack fruitfully the issue of validity. 


2This study has since been replicated with a 
second group of 27 affiliate nurses, with almost iden- 
tical results. 


Brief Report 
Received July 20, 1954. 
The Relation Between Acceptance of Self and 
Acceptance of Others Shown by Three Personality 
Inventories1 

Katharine T. Omwake 


Agnes Scott College 


In recent years psychiatrists and clinical 
psychologists have observed a relation between 
the attitude toward the self shown by the pa- 
tient and his attitude toward other people. Ad- 
ler [1] noted a depreciation of others in those 
who themselves felt inferior. Horney [5] as- 
serted that the person who does not love him- 
self is incapable of loving others. Rogers [8] 
said that the person who accepts himself will 
for that very reason have better interpersonal 
relations with others. He also observed that, 
during therapy, as a person begins to accept 
himself he becomes capable of experiencing 
this attitude toward others. Changes in accept- 
ance of the self and correlated changes in the 
acceptance of others occurring during client- 
centered therapy were studied by Sheerer [9]. 
Her study, and a similar one by Stock [10] al- 
so based on a small number of counseling cases, 
showed that perceptions of others, feelings to- 
ward others, and acceptance of others are sig- 
nificantly related to the perception of the self, 
feelings about the self, and acceptance of the 
self. These studies show that, in the course of 
therapy, with the increase in self-acceptance 
there is a corresponding increase in favorable 
attitude toward others. 

Several attempts have been made to see to 
what extent these observations made by clinicians 
hold true for larger, more normal populations. 
Phillips [6] constructed a questionnaire 
on Attitudes Toward the Self and Others 
which he administered to several groups of students. 
In a university class composed chiefly of 
older students the correlation between attitudes 
toward the self and those toward others was 
.74, while with a younger group of college age 
it was .54. Berger [2] used the definitions of 
acceptance of self and acceptance of others 
which Sheerer [9] had made as the basis for 
the development of a questionnaire or scale. He 
found a correlation of .65 between acceptance 
of self and acceptance of others for evening 
students and of .36 for day students. This is 
in agreement with the study by Phillips [6], 
who also found a closer relation between acceptance 
of self and acceptance of others in the 
older group of college students than in the 
younger group. Bills, Vance, and McLean [4] 
devised an Index for Adjustment and Values 
which measures self-acceptance, and a corresponding 
form which shows the individual’s 
perception of how other people accept themselves. 


1This article is an abbreviated version of a paper 
read at the meeting of the Southern Society for 
Philosophy and Psychology in Atlanta, April 1954. 

Phillips [6], Berger [2], and Bills, Vance, 
and McLean [4] have published studies of the 
reliability and validity of their tests, and the 
results obtained from administration of them 
to college students and to some other groups, 
but they have not published the tests them- 
selves. In the present study all three of these 
unpublished tests, by permission of their au- 
thors, have been administered to a group of 
over one hundred college students. 


Problem 


This study is designed to test the assumption 
that, in a normal population, there is a positive 
relation between the acceptance of self and the 
acceptance of others, that those who are self- 
acceptant are also acceptant of others, and that 
those who reject themselves also tend to reject 
others. 

A second hypothesis is that there should be 
agreement among tests designed to measure 
the same trait; that, therefore, various tests of 
acceptance of self should give similar results, 
and that various measures of acceptance of 
others, likewise, should agree. 
Method 


Three unpublished personality inventories, 
made available to the investigator by their au- 
thors, were used in this study. They are: the 
scale for Self-Acceptance and Acceptance of 
Others by Berger [2], the questionnaire on At- 
titudes Toward the Self and Others by Phillips 
[6], and the Index of Adjustment and Values 
for self and for others, by Bills, Vance, and 
McLean [4]. Each was mimeographed with- 
out a title, so there was nothing to indicate 
what the inventory was designed to measure. 
These three personality inventories, described 
below, were administered to 113 students in 
the first course in psychology at Agnes Scott 
College. The students were told that the in- 
vestigator wished to make a study of the per- 
sonality tests, and that in order to make the 
study it was not necessary to know the identity 
of those taking the tests. Each student was as- 
signed a number which she placed on each test. 
It was hoped that this method of preserving an- 
onymity would increase frankness in answering 
the questionnaire. 


The scale for measurement of Self-Acceptance and 
Acceptance of Others by Berger consists of 36 items 
concerned with acceptance of self and 28 with the 
acceptance of others, mixed in random order. The 
following is typical of the statements showing atti- 
tude toward the self: “I realize that I’m not living 
very effectively but I just don’t believe I’ve got it in 
me to use my energies in better ways.” “I believe 
that people should get credit for their accomplish- 
ments, but I very seldom come across work that de- 
serves praise” shows attitude toward others. The 
person taking the test is to rate each item on a five- 
point scale. 


The Attitudes Toward the Self and Others ques- 
tionnaire by Phillips is basically similar to Berger’s 
scale. It consists of 25 statements showing attitudes 
about the self and 25 showing attitudes toward 
others, mixed in random order. The person taking 
the test is to rate attitude on each statement on a 
five-point scale. 


The Index of Adjustment and Values by Bills, 
Vance, and McLean consists of 49 adjectives, such as 
acceptable, busy, calm, poised, and tactful. The sub- 
ject is asked to use each word to complete the sen- 
tence “I am a (an) ................ person,” using a five- 
point scale to indicate how much of the time this is 
like him, and then to indicate, by rating, how he 
feels about himself as described in the first rating. 
The sum of these second ratings gives the measure 
of self-acceptance used in this study. The Index for 
others is similar. The subject uses the same 49 ad- 
jectives in turn to complete the statement “Other 
people are ................ persons,” with friends particu- 
larly in mind, and indicates how other people feel 
about themselves as they are, using a five-point 
scale as before. This gives a measure of the extent 
to which others are seen as accepting themselves. 


Results and Discussion 


The three inventories measuring self-accept- 
ance agree markedly, as shown by the correla- 
tions in Table 1. As might be expected, there 
is closest agreement between the tests which 
are most similar in form and content. Degree 
of acceptance of self given by Berger’s scale 
correlates .73 with degree of self-accept- 
ance in Phillips’ scale. Bills’s Index, which 
measures self-acceptance by a different tech- 
nique, is in less close agreement with the other 
two scales, although even here the correlation 
of .49 with Berger’s scale and .55 with Phil- 
lips’ indicates a substantial correspondence. All 
three correlations are significant at the 1 per 
cent level of confidence. 


Table 1 


Correlations of Measures of Acceptance of Self 
and Acceptance of Others 


                                                    Level of 
Tests                              Correlation    significance 

Acceptance of self vs. acceptance of self 
  Berger vs. Phillips                  .73             1% 
  Bills vs. Berger                     .49             1% 
  Phillips vs. Bills                   .55             1% 
Acceptance of others vs. acceptance of others 
  Berger vs. Phillips                  .60             1% 
  Bills vs. Berger                     .23             5% 
  Phillips vs. Bills                   .13             - 
Acceptance of self vs. acceptance of others 
  Berger vs. Berger                    .37             1% 
  Bills vs. Bills                      .39             1% 
  Phillips vs. Phillips                .41             1% 
  Berger vs. Phillips                  .25             5% 
  Berger vs. Bills                     .23             5% 
  Bills vs. Berger                     .23             5% 
  Bills vs. Phillips                   .18             - 
  Phillips vs. Bills                   .30             1% 
  Phillips vs. Berger                  .34             1% 





There is slightly less agreement among the 
tests showing attitudes toward others. Here al- 
so the two tests which are similar in construc- 
tion agree most closely. The correlation be- 
tween the tests by Berger and Phillips is .60, 
significant far beyond the 1 per cent level, but 
lower than the correlation of .73 between the 
tests of acceptance of self by the same authors. 
Since Bills’s Index for others shows the indi- 
vidual’s perception of how other people accept 
themselves, rather than the degree to which he 
is acceptant of others, it is understandable that 
it correlates less well with tests measuring the 
individual’s attitude toward others. Bills’s In- 
dex correlates .23 with Berger’s scale, signifi- 
cant only at the 5 per cent level, while the cor- 
relation of .13 with Phillips’ scale is not sta- 
tistically significant. 
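
The significance levels quoted above can be checked with the usual t transformation of a Pearson correlation. The following is a minimal sketch, not the author's own computation: it assumes that all 113 students contributed to every correlation, and the critical values 1.98 and 2.62 are approximate two-tailed table values for about 111 degrees of freedom. The function names are hypothetical.

```python
import math


def r_to_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def significance_level(r: float, n: int, t_05: float = 1.98, t_01: float = 2.62) -> str:
    """Classify r as significant at the 1% level, the 5% level, or not significant.

    t_05 and t_01 are approximate two-tailed critical values for df near 111
    (an assumption; consult a t table for other sample sizes).
    """
    t = abs(r_to_t(r, n))
    if t >= t_01:
        return "1%"
    if t >= t_05:
        return "5%"
    return "n.s."


# Correlations discussed in the text, with N = 113 assumed for each.
for r in (0.60, 0.23, 0.13):
    print(r, round(r_to_t(r, 113), 2), significance_level(r, 113))
```

Under these assumptions the three values fall into the same classes reported in the text: .60 at the 1 per cent level, .23 at the 5 per cent level, and .13 not significant.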


There is a consistent tendency for those who 
accept themselves to be acceptant of others and 
to view others as being self-acceptant, and, on 
the other hand, for those who have a low opin- 
ion of themselves to reject others also, and to 
see others as rejecting themselves. This is seen 
in the correlations of .37, .39, and .41, signifi- 
cant at the 1 per cent level, between the accept- 
ance of self and attitudes toward others as 
measured by the inventories by Berger, Bills, 
and Phillips. 


These correlations are similar to those ob- 
tained by Berger [2] and by Phillips [6] in 
groups of the same age as the group in this 
study. Berger’s correlation is .37 as compared 
with .36 in the present study; Phillips’ corre- 
lation of .54 shows a closer relationship than 
does the correlation of .41 obtained with the 
present group. Bills has not reported the corre- 
lation between self-other attitudes, so no com- 
parison is possible. 


As might be expected, each test of self-ac- 
ceptance agrees more closely with the corres- 
ponding test of attitudes toward others made 
by the same author than it does with the scale 
for others made by another author. In every in- 
stance the correlation between the self-other 
parts of the same test is significant at the 1 per 
cent level of confidence, whereas Phillips’ test 
of self-acceptance is the only one which gives 
correlations of this degree of statistical signifi- 
cance with tests of attitudes toward others made 
by another investigator. Three of the four other 
correlations between self-attitudes measured by 
one test and attitudes toward others obtained 
from another test are significant at the 5 per 
cent level. As shown by the correlations in 
Table 1, there is a consistent tendency for self- 
attitudes to be reflected in attitudes toward 
others. 


A comparison was made of the attitude 
toward others held by those who represented 
the extremes of self-acceptance and of self-re- 
jection. The self-acceptant group consisted of 
19 subjects who were in the upper three deciles 
on all three tests of self-acceptance; the group 
who rejected themselves was composed of 12 
individuals who were consistently in the lowest 
three deciles on all three tests. The larger num- 
ber in the self-accepting group shows that a 
high degree of self-acceptance is more consist- 
ently shown than is a low degree; only a small 
number are extremely self-rejectant on all 
three tests, while a larger number consistently 
hold themselves in high esteem. On each test 
the mean score on attitude toward others made 
by the self-acceptant group is higher than the 
mean score made by the self-rejecting group. 
The difference between the means is signifi- 
cantly different from zero well beyond the 1 
per cent level of confidence. This difference 
between the means on the tests by Berger and 
by Phillips indicates that those who are most 
self-acceptant also have a high degree of ac- 
ceptance of others, while those who reject 
themselves also hold others in low esteem. The 
difference between the means on Bills’s Index 
would indicate that those who are self-accept- 
ant perceive others as accepting themselves, 
while those who are least self-acceptant per- 
ceive others as rejecting themselves. As had 
been assumed, there is a direct relation be- 
tween the attitude an individual holds toward 
himself and that held toward others. 


The data presented here, although based on 
personality inventories that do not probe very 
deeply into personality dynamics, verify in a 
fairly large group of college students the con- 
clusions derived by clinicians from therapy 
sessions. There is evidence that in the normal 
population, as well as in those undergoing ther- 
apy, attitudes toward the self appear to be re- 
flected in attitudes toward other people: the 
lower the opinion of the self, the lower the 
opinion of others. Only when the self is re- 
garded with a fairly high degree of acceptance 
is it possible to relate effectively to others, to 
understand them, and to regard them as per- 
sons of worth. 


Summary 


To test the hypothesis that in a normal pop- 
ulation there is a positive relation between ac- 
ceptance of self and acceptance of others, three 
unpublished tests measuring attitudes toward 
the self and toward others were administered 
to 113 college students who took them anony- 
mously. The tests are: the scale for Self-Ac- 
ceptance and Acceptance of Others by Berger, 
the questionnaire on Attitudes Toward the Self 
and Others by Phillips, and the Index of Ad- 
justment and Values by Bills, Vance, and 
McLean. The three measures of self-accept- 
ance agree closely; those for attitudes toward 
others agree less well. The results support the 
hypothesis in that there is a marked relation be- 
tween the way an individual sees himself and 
the way he sees others; those who accept them- 
selves tend to be acceptant of others and to 
perceive others as accepting themselves; those 
who reject themselves hold a correspondingly 
low opinion of others, and perceive others as 
being self-rejectant. 


Received March 30, 1954. 


Self-Acceptance and Its Relation to Conflict 


Herbert Zimmer 


Air Force Personnel and Training Research Center 


Raimy [4] was able to show a shift in self- 
evaluation in successfully counseled clients, 
which did not occur in unsuccessfully coun- 
seled clients. The shift was one from self-dis- 
approval to self-approval. Following his lead 
Bills, Vance, and McLean [2] defined personal 
maladjustment “as any discrepancy between 
the concept of self and the concept of the ideal 
self.” Their Index of Adjustment and Values 
uses the total of the discrepancies between the 
self-concept and the concept of the ideal self 
as a measure of adjustment. Roberts [5] 
demonstrated significantly longer reaction 
times on a word association test for words with 
self-concept—ideal-self discrepancies than for 
words without discrepancies, and considered 
such discrepancies a “valid index of emotion- 
ality.” Bills [1], upon repeating Roberts’ ex- 
periment with the identical method, found a 
trend in the direction of Roberts’ results. 

The present study was undertaken to check 
the efficacy of self-concept—ideal-self discrep- 
ancies as indicators of conflict, and by infer- 
ence of maladjustment. It tested the hypothesis 
that the presence of conflict over a personality 
trait is associated with a self-concept—ideal-self 
discrepancy on that trait. A confirmation of 
this hypothesis would support Roberts’ findings 
[5], and provide a direct rationale for the use 
of instruments employing self-concept—ideal- 
self discrepancies as a measure of adjustment, 
and of improvement in counseling situations. It 
should be noted that this experiment was car- 
ried out prior to the publication of Roberts’ 
[5] and Bills’ [1] articles, and has therefore 
approached the same problem independently 
and by means of a different method. 


Procedure 


Fifty-two subjects rated themselves (a) as 
they are, and (b) as they would like to be with 
respect to 25 personality traits on two similar 
seven-point rating scales. By using the trait 
adjectives as stimulus words in a word associ- 
ation and reproduction test, six of the tradi- 
tional complex indicators were obtained, and 
employed as an index of conflict. 

The subjects were 26 male undergraduate 
students at the University of Rochester and 26 
male neuropsychiatric patients at the Canan- 
daigua, New York, Veterans Administration 
Hospital. A specially constructed 20-item vo- 
cabulary test was used to screen all subjects 
with an inadequate understanding of words of 
the level of difficulty used in the experiment. 

The 25 trait adjectives selected for use in 
the experiment were: leisurely, emotional, am- 
bitious, deliberate, trusting, sentimental, cau- 
tious, refined, energetic, orderly, poetic, wary, 
lusty, dominant, spontaneous, economical, re- 
spectful, daring, precise, meek, persistent, ar- 
dent, conventional, determined, and obedient. 
They were chosen on the basis of (a) occur- 
ring 6 to 10 times in one million words in the 
Thorndike-Lorge word count [7] column G, 
(b) not representing a cultural stereotype of a 
desirable or undesirable personality trait, and 
(c) having personal and dynamic meaning. 
Word frequency was controlled because of its 
possible effect on word association reaction time. 
Stereotyped traits were eliminated, since con- 
sistently extreme ratings might affect the size 
of self-concept—ideal-self discrepancies. The 
order of presentation of stimuli was varied 
systematically to counteract position effects. 


Each trait was evaluated by the subject on 
two seven-point rating scales. The self-concept 
scale ran from “7. I am a very . . . person.” 
to “1. I am definitely not a . . . person.” The 
ideal-self scale ran from “7. I would like to be 
a very . . . person.” to “1. I definitely don’t 
want to be a . . . person.” The complete list 
of traits was first rated on one scale, and later 
on the other scale. 

The complex indicators in the word associa- 
tion test, which were used as the index of con- 
flict, were (a) long reaction time, (b) long 
reproduction time, (c) defective reproduction, 
(d) repetition of the stimulus word, (e) re- 
sponding with more than one word, and (f) 
obvious and clear-cut manifestations of overt 
emotional behavior. Reaction time was re- 
corded with a Standard Electric Timer. Three 
of the indicators are identical to the complex 
indicators in Hull and Lugoff’s [3] list which 
show the highest positive correlation with the 
combination of all the complex indicators used 
by them. Each of the six indicators was weight- 
ed equally, since no empirical system of as- 
signing differential weights, which would des- 
ignate the relative merit of the various indi- 
cators in signifying conflict, has yet been 
devised. 

Table 1 indicates that traits with zero com- 
plex indicators, as well as traits which evoked 
two or more complex indicators, differentiated 
successfully between two groups of distinctly 
different emotional adjustment, while traits 
with one complex indicator did not. For this 
reason a nonconflictual trait was defined as one 
which did not elicit any complex indicators, 
and a conflictual trait as one which evoked 
two or more complex indicators. 


Table 1 

Differences between Hospital and College Groups in 
Mean Number of Traits Eliciting 
Complex Indicators 

                             Subjects 
                      Hospital     College 
Statistic              group        group      Difference 
                      (N = 26)     (N = 26) 

Traits with 0 indicators 
  Mean                  6.23         8.27         2.04 
  SD                    2.19         1.[illegible] 
  t                                               3.505 
  p                                                .001 
Traits with 1 indicator 
  Mean                  5.69         6.23          .54 
  SD                    2.18         2.66 
  t                                                .785 
  p                                               >.20 
Traits with 2 or more indicators 
  Mean                 13.08        10.50         2.58 
  SD                    2.15         2.21 
  t                                               4.188 
  p                                              <.001 
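
The trait classification defined above, together with the discrepancy score it is later compared against, can be sketched as follows. This is an illustrative sketch only: the ratings and indicator counts are invented toy values, the function names are hypothetical, and taking the discrepancy as the absolute difference between the two seven-point ratings is an assumption, since the report does not spell out the formula.

```python
from statistics import mean


def classify_trait(n_indicators: int) -> str:
    """Apply the definition in the text: 0 indicators is nonconflictual,
    2 or more is conflictual, and exactly 1 is left out of the comparison."""
    if n_indicators == 0:
        return "nonconflictual"
    if n_indicators >= 2:
        return "conflictual"
    return "excluded"


def mean_discrepancies(self_ratings, ideal_ratings, indicator_counts):
    """Mean |self - ideal| discrepancy for one subject, by trait class.

    Assumes the discrepancy is the absolute rating difference.
    """
    groups = {"nonconflictual": [], "conflictual": []}
    for trait, count in indicator_counts.items():
        label = classify_trait(count)
        if label in groups:
            groups[label].append(abs(self_ratings[trait] - ideal_ratings[trait]))
    return {label: (mean(vals) if vals else None) for label, vals in groups.items()}


# Toy ratings for one hypothetical subject (not data from the study).
self_r = {"leisurely": 5, "emotional": 3, "ambitious": 6, "wary": 2}
ideal_r = {"leisurely": 4, "emotional": 3, "ambitious": 7, "wary": 5}
counts = {"leisurely": 0, "emotional": 1, "ambitious": 2, "wary": 3}

result = mean_discrepancies(self_r, ideal_r, counts)
print(result)
```

Each subject then contributes one mean discrepancy for nonconflictual traits and one for conflictual traits, which is the pair of scores the hypothesis compares.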


The data given in Table 1 can only be con- 
sidered suggestive, since the two groups com- 
pared in it were not matched. It does, however, 
give some indication of the relative efficacy of 
traits with varying numbers of complex indi- 
cators as indices of conflict. 


Results 


Table 2 shows that no differences in the size 
of self-concept—ideal-self discrepancies were 
found between conflictual and nonconflictual 
traits. 

Table 2 

Self-Concept—Ideal-Self Discrepancies for 
Conflictual and Nonconflictual Traits 

                                Traits 
Statistic           Nonconflict    Conflict     Difference 

All subjects (N = 52) 
  Mean                 1.20          1.1        [illegible] 
  SD                [illegible]      .60           .37 
  t                                                .431 
  p                                             [illegible] 
Hospital group (N = 26) 
  Mean                 1.30          1.20       [illegible] 
  SD                    .82           .6           .68 
  t                                                .482 
  p                                               >.05 
College group (N = 26) 
  Mean                 1.11          1.10          .01 
  SD                    .39           .48          .54 
  t                                                .095 
  p                                               >.05 


A sign test was also done on these data. 
With the number of cases at 51 (one case was 
a tie), and the number of the less frequent 
sign at 21, no significant difference was found 
between conflictual and nonconflictual traits 
with respect to discrepancy scores. No signifi- 
cant difference exists between the two groups 
with respect to the difference between the 
scores on conflictual and nonconflictual traits. 
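
The sign test reported above can be reproduced from the two counts given in the text (51 untied cases, 21 with the less frequent sign). The sketch below uses an exact two-sided binomial calculation; the report does not say whether an exact test or a table was used, so that choice, and the function name, are assumptions.

```python
from math import comb


def sign_test_p(n: int, k_less: int) -> float:
    """Exact two-sided sign test p value.

    n is the number of untied cases and k_less the count of the less
    frequent sign; under the null hypothesis each sign has probability 1/2.
    """
    tail = sum(comb(n, k) for k in range(k_less + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Counts taken from the text: 51 untied cases, 21 of the less frequent sign.
p = sign_test_p(51, 21)
print(round(p, 3))
```

With these counts the p value is well above .05, in line with the statement that no significant difference was found.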


The results do not support the hypothesis 
that the presence of conflict over a personality 
trait is associated with a discrepancy between 
the concept of self and the concept of the ideal 
self. Consequently, these results are not in 
agreement with the findings of Roberts [5] 
and Bills [1]. However, they are in line with 
a recent study by Smith [6], who reports that 
self-concept—ideal-self discrepancies have no 
discernible effect on ease of perceiving, learning, 
and remembering of trait adjectives. Perhaps 
future studies will be able to resolve existing 
differences. 


Summary 


This study was designed to test the hypothe- 
sis that the presence of conflict over a person- 
ality trait is associated with a discrepancy be- 
tween the concept of self and the concept of the 
ideal self. A confirmation of this hypothesis 
would provide a rationale for instruments em- 
ploying self-concept—ideal-self discrepancies as 
a measure of adjustment, and of improvement 
in counseling situations. Fifty-two subjects 
rated themselves (a) as they are, and (b) as 
they would like to be with respect to 25 per- 
sonality traits on two similar seven-point rating 
scales. By using the trait adjectives as stimulus 
words in a word association and reproduction 
test, six of the traditional complex indicators 
were obtained, and employed as an index of 
conflict. The results do not substantiate the 
hypothesis, and thereby fail to support the con- 
tention that discrepancies between the concept 
of self and the concept of the ideal self are di- 
rectly indicative of conflict. 


Received April 29, 1954. 


A Note on Consistency of Rigidity as a 
Personality Variable¹ 
Hermann O. Schmidt, Charles P. Fonda, 
and Elizabeth L. Wesley² 


Norwich State Hospital 


Luchins [1] has asserted that problem-solv- 
ing rigidity is “not a function of the personality 
per se, but of particular field conditions.” On 
the other hand, findings reported by others sug- 
gest that consistent differences in this kind of 
rigidity do exist among individuals. To test 
the hypothesis of consistency in the tendency to 
adhere to induced behavior after it ceases to rep- 
resent the most direct path to a goal, two dif- 
ferent tests purporting to measure this type of 
rigidity were administered to 93 student nurses. 
One test, the Wesley Rigidity Scale [3], samples 
via self-report a variety of behaviors regarded 
by psychologists as manifestations of rigid be- 
havior. The other test, designed by Luchins [2], 
involves solutions (to water-jar problems) that 
give varying evidence of problem-solving (Ein- 
stellung) rigidity. 

Subjects were classified into three categories 
of rigidity according to the degree of Einstel- 
lung effect: I. Minimal: all possible solu- 
tions by a short method. II. Equivocal: no 
spontaneous shift to the short method, but con- 
sistent use of this method after demonstration. 
Included in this group also were subjects who 
either failed to solve one of the “set” problems 
or solved any of them in an atypical fashion. 
III. Maximal: no spontaneous shift to the 
short method, with reversion to the longer 
method at least once after demonstration of the 
shorter technique. 


1An extended report of this study may be obtained 
without charge from Hermann O. Schmidt, Nor- 
wich State Hospital, Norwich, Connecticut, or for a 
fee from the American Documentation Institute. To 
obtain it from the latter source, order Document No. 
4396 from ADI Auxiliary Publications Project, Pho- 
toduplication Service, Library of Congress, Washing- 
ton 25, D.C., remitting in advance $1.25 for micro- 
film or $1.25 for photocopies. Make checks payable 
to Chief, Photoduplication Service, Library of Con- 
gress. 


2Now at the University of Louisville, Medical 


On the Wesley scale, the mean for 22 sub- 
jects in Group I was 19.9 (SD = 4.89), 
whereas for 23 subjects in Group III the mean 
was 23.35 (SD = 4.43). This difference is 
significant at the .01 level (one-tailed test) 
with t = 2.42. Differences between means for 
Groups I and II (t = 1.0) and between 
Groups II and III (t = 1.7) are insignificant. 
Thus the hypothesis is supported that rigidity, 
as here defined, is a consistent personality trait. 
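
The reported t for Groups I and III can be recomputed from the summary figures in the text. The note does not state which formula was used; the sketch below assumes the older convention in which each SD is computed with N in the denominator, so that the standard error of a mean is SD divided by the square root of N - 1. That convention is an assumption, and the function name is hypothetical.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t for the difference between two independent means from summary data.

    Assumes each SD was computed with N in the denominator, so the squared
    standard error of each mean is SD**2 / (N - 1).
    """
    se = math.sqrt(sd1 ** 2 / (n1 - 1) + sd2 ** 2 / (n2 - 1))
    return (mean2 - mean1) / se


# Group I: 22 subjects, mean 19.9, SD 4.89; Group III: 23 subjects, mean 23.35, SD 4.43.
t = t_from_summary(19.9, 4.89, 22, 23.35, 4.43, 23)
print(round(t, 2))
```

Under this assumption the result rounds to the 2.42 given in the text; a pooled-variance formula applied to the same figures gives a slightly larger value.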


Brief Report 
Received July 6, 1954. 


Reading skill is so central in our culture that 
much effort has gone into studies of reading in 
the schools. One aspect of this literature con- 
siders the relationship of reading performance 
to personality organization. From this literature 
one must conclude that, among the multiple 
causes of reading difficulty, emotional disturb- 
ance should be postulated as a relevant factor 
in some cases. Many studies which affirm this 
view are to be found [3, 4, 5, 6, 7, 9, 11]. 

If we entertain the hypothesis that there is a 
meaningful relationship between personality 
disturbance and reading difficulty, it follows 
as a corollary that experiences which are de- 
signed to modify feelings and attitudes in a 
therapeutic way should also modify reading 
performance. Bills [1] has shown that when 
play therapy sessions were held with retarded 
readers, greater reading gains accrued than 
during a comparable no-therapy period. In a 
companion study conducted with well-adjusted 
retarded readers [2], he found that play ther- 
apy made no difference in reading performance. 
One can conclude that therapy is relevant to 
reading gain primarily where emotional factors 
may be at issue, but that therapy will not help 
performance when emotional factors are ruled 
out. 


The study to be reported here takes these 
considerations as a point of departure. Specifi- 
cally, we wish to test the hypothesis that a ther- 
apeutic approach to teaching will yield signi- 
ficant changes in personality and in reading 
performance. 


Procedure 


Overview of design. In order to test the 
foregoing hypothesis we must include the fol- 
lowing procedures in the experimental design: 
(a) identify children who rank low in personal 
adjustment and in reading achievement, (b) 
provide an experience which is therapeutic in 
intent, (c) measure the effects of this experi- 
ence upon personal adjustment and reading 
performance, and (d) provide adequate con- 
trols so as to rule out alternative explanations 
of the experimental outcome. These points will 
be considered in greater detail below. 

Selection of experimental and control groups. 
The samples used in the study were drawn 
from the fifth- and sixth-grade classes of a 
large city. The children were predominantly 
of lower socioeconomic status. 

For the identification of children who ranked 
low in personal adjustment we used the Tud- 
denham form of the Reputation Test [10]. 
This is a peer-rating questionnaire which seeks 
judgments of specific behavior traits that can 
be assigned to positive-negative adjustment 
categories. 

For the selection of children low in reading 
achievement the Gates Reading Survey was 
used. The selection criterion here was a read- 
ing score significantly lower than average ex- 
pectation for the grade. This criterion is dif- 
ferent from the usual criterion of “reading 
retardation,” which is usually defined as read- 
ing performance significantly below expecta- 
tion for a given mental age level. The use of 
our criterion was predicated on the assumption 
that where a child was low in both reading 
and adjustment rank, intellectual functioning 
might be below potential. This assumption, of 
course, is implicit in the entire hypothesis of 
the study. We thus postulate that a thera- 
peutic experience would have a releasing effect 
upon intellectual functioning, which would be 
reflected operationally by a gain in reading 
scores. 


Experimental and control groups were se- 
lected in two stages: first, the 38 children with 
the lowest composite rank in both the Tud- 
denham and the Gates were selected for the 
study; second, children were paired with ref- 
erence to score similarity and assigned one 
each to the experimental and control group. In 
this way, comparability of experimental and 
control groups was insured. The first row of 
Table 1 indicates the close similarity of initial 
scores. 
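
The two-stage selection described above (pairing children on score similarity, then placing one member of each pair in each group) might be sketched as follows. The report does not say how the member of each pair was chosen for the experimental group, so the random choice here is an assumption, as are the toy composite scores and all names in the code.

```python
import random


def matched_pairs(scores: dict, seed: int = 0):
    """Pair subjects with adjacent composite scores and split each pair.

    Random assignment within a pair is an assumption; the source states only
    that one child of each pair went to each group.
    """
    rng = random.Random(seed)
    ordered = sorted(scores, key=scores.get)  # subject ids, lowest score first
    experimental, control = [], []
    for i in range(0, len(ordered) - 1, 2):
        pair = [ordered[i], ordered[i + 1]]
        rng.shuffle(pair)
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control


# Toy composite scores for 38 hypothetical children (not data from the study).
toy_scores = {f"child_{i:02d}": 40 + (i * 7) % 23 for i in range(38)}
exp_group, ctl_group = matched_pairs(toy_scores)
print(len(exp_group), len(ctl_group))
```

Pairing on adjacent scores keeps the two groups close on the selection measures, which is the comparability the first row of Table 1 is meant to show.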


The pattern of testing. After the groups 
were selected, the Rogers Personality Test [8] 
was administered to the children in groups of 
four to six according to the test’s directions. 
This test was chosen to yield another measure 
of personal adjustment. 

At the end of the experiment, the Rogers 
and the Gates tests were given again to the 
experimental and control groups, and the Rep- 
utation Test was repeated with the classes 
from which the experimental and control 
groups were selected. 


The experimental treatment. For the dura- 
tion of the experiment the teacher-therapist 
met with the children for one-half hour daily 
in groups of four to seven. Varied materials 
were provided, including art materials, books, 
and games. The teacher made it clear to the 
children both at the outset and during the ex- 
periment that they could use the time as they 
wished: in talking, reading, working puzzles 
and games, drawing, or just sitting. The teach- 
er’s intent throughout the sessions was to main- 
tain an open, permissive, understanding atmos- 
phere which could encourage exploration and 
expression by the children. 

The children responded by using the time in 
varied ways. There was frequent interaction 
between children and teacher. Approximately 
half the time was spent on books and games, 
while the other half was divided between the 
use of art materials and no materials at all. 
The average number of sessions per child was 
67, distributed over a four-month period. 


Results 


The quantitative findings appear in Table 1. 
From the table several results are evident. 
First, it is clear that the experimental group’s 
reading gain is significantly greater than the 
gain of the control group. The experimental 
group’s gain was .69 years — i.e., about seven- 
tenths of a year gain in four months. 
Table 1 


Reading and Personality Test Changes for 
Experimental and Control Groups 


                          Gates test       Reputation Test       Rogers test 
Measure                  Exp.   Control     Exp.    Control     Exp.   Control 

Pretest mean scores      3.80    3.74      28.91    30.64       43.6    43.0 
Posttest mean scores     4.49    3.86      16.14    20.23       50.5    43.6 
Differences               .69     .12      12.77    10.41        6.9     0.6 
p values of between- 
  group differences         <.01               .65            .10 > p > .05 

In the Reputation Test no significant dif- 
ferences are evident. The Rogers test results 
show a more complex picture. In the Rogers 
test, the higher the score, the greater the mal- 
adjustment. The experimental group shows a 
pretest vs. posttest difference of 6.9 points in 
the direction of maladjustment; the control 
group’s change was .6 points. The difference 
between the experimental and control group 
changes just misses significance at the 5% 
level. We therefore conclude that there were 
no significant differences in Rogers test changes 
in this experiment, and also take note of the 
fact that the effect approaches significance in a 
direction contrary to expectation. 
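
The change scores in Table 1 follow directly from the pretest and posttest means, and the arithmetic can be checked with a short sketch. The numbers are the means given in the table; the dictionary layout and names are hypothetical, and the table reports changes as magnitudes, whereas the code keeps the sign of posttest minus pretest.

```python
# (pretest mean, posttest mean) for each test and group, as given in Table 1.
means = {
    ("Gates", "experimental"): (3.80, 4.49),
    ("Gates", "control"): (3.74, 3.86),
    ("Reputation", "experimental"): (28.91, 16.14),
    ("Reputation", "control"): (30.64, 20.23),
    ("Rogers", "experimental"): (43.6, 50.5),
    ("Rogers", "control"): (43.0, 43.6),
}

# Signed change: posttest minus pretest, rounded to the table's precision.
change = {key: round(post - pre, 2) for key, (pre, post) in means.items()}

# How much more the experimental group changed than the control group.
between = {
    test: round(change[(test, "experimental")] - change[(test, "control")], 2)
    for test in ("Gates", "Reputation", "Rogers")
}

for key, value in change.items():
    print(key, value)
print(between)
```

On the Gates test the experimental group gained .57 of a year more than the control group, and on the Rogers test its score rose 6.3 points more, which is the shift toward maladjustment noted in the text.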


Discussion 


The children in our experimental groups 
showed a relatively strong gain in reading but 
no gain in adjustment scores. Thus we do not 
have a simple confirmation or disconfirmation 
of our hypothesis, but a more complex state of 
affairs which invites speculative exploration 
here. Our very first note, of course, is to sug- 
gest a repetition of the experiment to determine 
whether either the reading results or adjust- 
ment results can be accounted for by experi- 
mental error. We would then know whether 
we are speculating about fact or illusion. But 
meanwhile our present results are real enough 
for the moment, and we shall try to reason out 
their possible meaning. 

A parsimonious conclusion is that a thera- 
peutic experience at school can yield gain in an 
intellectual function without corresponding 
gain in measured adjustment. Perhaps the ef- 
fect of freeing the child to learn can come 
about as a specific effect without touching the 
more pervasive personality organization. In 
other words, perhaps therapy here ameliorated 
conflict about school learning without a more 
generalizing effect. 


This explanation must remain as one of the 
possibilities in the situation. On the other hand, 
we might do well to bring in another facet of 
the results as a way of seeking other explana- 
tions. In this connection we observe that the 
Rogers test yielded a nearly significant decre- 
ment in adjustment scores for the experimental 
group. One way of putting this finding is that 
there was a greater state of flux in the experi- 
mental group results than in the control group 
results. This fact has a close analogue to the 
indications in an unpublished manuscript by 
Bolgar.¹ In a series of studies which analyzed 
Rorschach protocols before therapy, during 
therapy, and after therapy, Bolgar found test 
changes which indicated that, during therapy, 
disorganization as defined by poor form-quali- 
ty responses (the “minus” scores) was more 
in evidence than before therapy. She argued 
that a loosening-up process might be reflected 
in these scores. In the present study the greater 
downward shift in the experimental group 
scores might be reflecting this same loosening- 
up tendency—i.e., we may suspect that a 
tendency to disorganization is part of a thera- 
peutic process. We cannot, of course, do more 
than speculate about such a possibility, but we 
should also not like to see this possibility dis- 
missed. 


Summary 


This study was designed to test the hy- 
pothesis that a therapeutic approach to teach- 
ing modifies personality and intellectual per- 
formance. Comparable experimental and con- 
trol groups were selected and one group had 
daily sessions with a teacher-therapist who used 
therapeutic principles in her work with the 
group. The experimental group showed signifi- 
cant reading gains as compared with the con- 
trol group. No significant differences in per- 
sonality measures occurred. A trend toward 
decrement in adjustment scores for the experi- 
mental group was noted and discussed. 


1Bolgar, Hedda. Personal communication, 


Patients to Freudian Sexual Symbols¹,² 


The University of Southern California 


The study was concerned with the extent 
of agreement by normals and mental hospital 
patients with the gender assigned by Freud [1] 
to words which purportedly were symbolic of 
male or female sexual organs. 

Subjects consisted of 99 mental hospital pa- 
tients drawn from five hospital wards, 12 hos- 
pital attendants, and 27 student nurses. The 
Ss were presented with 113 words which 
Freud's statements indicated to be typically 
symbolic of male or female sexual character- 
istics, and were requested to indicate which 
words reminded them of the male and which 
of the female sexual organs. 

The agreement with Freud was 65% for 
admission ward patients, 65% for long-term 
treatment patients, 48% for postlobotomy pa- 
tients, 62% for hyperactive patients, and 70% 
for parole ward patients. The agreement for 
nurses and for attendants was also 70%. 

The percentage agreement with Freud for
normals and the total population of patients
was significantly greater than chance, but also
differed significantly from perfect or 100%
agreement. The percentage of agreement with
Freud was significantly greater for normals
than for postlobotomy patients (p = .001),
hyperactive patients (p = .01), and long-term
treatment patients (p = .001), and approached
significance for admission ward patients (p =
.10). The superiority of normals to patients
occurs on significantly more than one-half the
words presented (p = .01). The normals
showed significantly less blocking than patients
in completing the task (p = .01).


1An extended report of this study may be ob-
tained without charge from Alfred Jacobs, Univer-
sity of Southern California, Los Angeles, California,
or for a fee from the American Documentation In-
stitute. To obtain it from the latter source, order
Document No. 4398 from ADI Auxiliary Publica-
tions Project, Photoduplication Service, Library of
Congress, Washington 25, D.C., remitting in ad-
vance $2.00 for microfilm or $3.75 for photocopies.
Make checks payable to Chief, Photoduplication
Service, Library of Congress.


2Sponsored by the Veterans Administration and
published with the approval of the Chief Medical
Director. The statements and conclusions published
by the author are a result of his own study and do
not necessarily reflect the opinion or policy of the
Veterans Administration.

The study lends support to the conclusion 
that considerable agreement exists between the 
gender assigned by Freud to sexual symbols 
and that assigned by both normals and patients. 
However, the agreement with Freud, even for 
groups of Ss, departs significantly from perfect 
agreement, and it is possible that cultural fac- 
tors either in Freud’s sample, or in that used in 
the present study may account for the differ- 
ences. 


The data also lend support to the conclusion 
that normals agree with Freud to a greater ex- 
tent than did most of the types of patients 
studied. It is not clear whether the poorer per- 
formance of the patients is attributable to their 
intellectual impairment, lack of social motiva- 
tion, or to the presence of more severe or more 
frequent sexual conflicts. 


Brief Report 
Received September 14, 1954. 


from Rorschach Color Responses¹


Robert W. Harrington 


The proposition that FC responses on the
Rorschach test indicate a higher level of emo- 
tional maturity than do the CF and C re- 
sponses has been expressed in various ways. 
Beck [1, p. 30] has arranged the color re- 
sponses in an order (C, CF, FC) which he 
considers to correspond to the stages of “emo-
tional maturity” of an individual, and Hertz
and Baker state, “The CF and C together re-
flect egocentricity and the more primitive,
childlike emotions, hence immaturity” [3, p. 
13]. 

Clinical workers have, to a large extent, ac- 
cepted the above interpretation of color scores 
and are prone to differentiate, on the basis of 
Rorschach color scores, between the levels of 
emotional maturity of otherwise comparable 
individuals. The purpose of the present study 
was to test the proposition that FC responses 
are indicative of a higher level of emotional 
maturity than are C and/or CF responses. 


With the assumption that a positive rela- 
tionship exists between emotional maturity 
and adequacy of response to frustration, the 
hypothesis of the study was: Individuals whose 
color responses on the Rorschach test are pre- 
dominantly of the C and/or CF type will 
show greater impairment in performance un- 
der conditions of frustration than will those 
individuals whose color responses are predomi- 
nantly of the FC type. 


Method 


Two groups of Ss, differing in type of color
emphasis on the Rorschach test, were tested
under conditions designed to provide for the
appearance of habit interference. In this way,
since habit interference involves interference
with goal-directed behavior, the conditions of
frustration were to be fulfilled.


1This paper is based on a doctoral dissertation
submitted to Michigan State College in 1953. The
author is indebted to Professors S. Howard Bartley,
Alfred G. Dietze, and Albert I. Rabin for their sup-
port and criticism throughout the study.


Subjects. The sample consisted of forty,
white, male juvenile delinquents at Boys Vo-
cational School, Lansing, Michigan.² Two
groups of 20 Ss each were equated on the fac-
tors of age and IQ. As can be seen in Table
1, the groups did not differ significantly on
these variables.


Table 1

Comparison of Age and IQ Data for Groups with
Primary Color Emphasis (C + CF) and Secondary
Color Emphasis (FC) on the Rorschach Test
(N = 20 in each group)

                 Age                    IQ*
Group       Mean    Range          Mean     Range
FC          15.85   13.0-17.0      102.75   76-128
C + CF      15.75   13.5-18.0      101.35   73-125

* Based on Wechsler-Bellevue scales [4].


In addition, the two groups were equated on 
several Rorschach variables; namely, F+%, 
number of M, total number of Y responses, 
number of P, and total number of responses. 
All scores were based on individually adminis- 
tered, standard Rorschach examinations. 


The independent Rorschach variable was 
the ratio: Number of FC responses/ Number 
of C+CF responses. As a general rule this ra- 
tio had to be at least 2/1 for inclusion in the 
FC group and 1/2 for the C+CF group. The 
ratio of Mean Number of FC/Mean Number 
of C+CF was 3.0/0.35 for the FC group and 
1.0/3.55 for the C+CF group, with no sub-
ject having a ratio which was different in di-
rection from that of his group. It can be seen
from Table 2 that the two groups were rough-
ly equated in regard to the frequency of all
the Rorschach factors considered with the ex-
ception of the color scores. All Rorschach re-
sponses were scored by the investigator accord-
ing to Beck’s system, and the scoring of the
color responses was checked by another psy-
chologist.³ Noncolor responses about which
there was some question were similarly checked.


2The author wishes to express his appreciation to
Mr. Robert Wisner, the superintendent, and to the
staff of Boys Vocational School for their cooper-
ation.


Table 2

Comparison of Rorschach Scores for Groups with
Primary Color Emphasis (C + CF) and Secondary
Color Emphasis (FC) on the Rorschach Test

                                 Group
                        FC                  C + CF
Scores              M       SD           M       SD       diff.   t
Total R             28.60   8.69         28.55   10.13    .05     .02
Number P            7.20    2.29         6.30    1.86     .90     1.36
Number M            1.45    1.12         1.70    1.23     .25     .65
Number Y (total)    3.20    3.07         2.55    2.48     .65     .72
F+%                 81.40   8.82         75.75   11.78    5.65    1.11
Number C            .00     .00          1.05    1.43     1.05    3.28*
Number CF           .35     [illegible]  2.50    1.28     2.15    7.41*
Number FC           3.00    1.30         1.00    1.41     2.00    4.65*





Note.—Corresponding p values were obtained when 
Fisher’s exact test was applied. 
* Indicates a p of <.01 (two-tailed test). 


Materials and Apparatus 


Two experimental tasks were used to com- 
pare the performances of the groups under 
frustration: (a) the code-substitution test, 
and (b) the mirror-tracing test. Both tests
were administered individually. 

The code-substitution test. Two alternate 
forms of a code-substitution test were used. 
Both forms utilized the same numbers and geo- 
metric figures but the codes, given at the top 
of the form, were different for Forms A and 
B. The task of the S was to write the appro- 
priate number within each figure as rapidly as 
possible. The E said “wrong” if S made a mis-
take and S corrected the error by writing in
the correct number over the wrong one. The
Ss were specifically instructed not to erase
since to do so would slow them down. Per-
formance was scored for the time per trial and
number of wrong substitutions.


3The help of Dr. Durand F. Jacobs is gratefully
acknowledged.

The mirror-tracing test. On the mirror- 
tracing test Ss had to trace a six-pointed star 
design under conditions of mirror vision. The 
design was constructed from copper plates 
with a chronograph, counter, and stylus con- 
nected to the plates to provide a measure of 
the total time spent off the pathway and the 
number of times S left the pathway. In addi-
tion, a stop watch was used to measure the
total time spent in tracing the design. 


Rationale for the experimental tasks. On 
both the code-substitution and mirror-tracing 
tests, it was assumed that previously acquired 
response tendencies would tend to interfere 
with the learning of new responses to old 
stimuli. 

On the code-substitution test Ss were given 
a number of trials on one form and then tested 
on an alternate form. Essentially the same con- 
ditions existed on the mirror-tracing test, 
since the previously acquired response tend- 
encies involved in tracing a straight line under 
direct vision could be expected to interfere 
with attempts to trace a straight line under 
mirror vision. Since there was no reason to 
assume differential amounts of experience 
among Ss in making eye-hand movements in 
space, no training was given in tracing the 
maze under direct vision. 


Procedure. Prior to testing, E talked with
each S and emphasized the facts that E had
no connection with the institution and that
participation in the experiment was entirely
voluntary. After ascertaining that S was will-
ing to participate, E said, “Fine, when we are
all through, if I think you have done your
best and really tried, I want to give you a
candy bar for helping me out” (the candy bars
were exposed to the view of S), and the re-
ward was again emphasized at the conclusion
of the instructions by E’s saying, “Remember,
if you do your best you get a candy bar.” Can-
dy was selected as a reward because of two
factors: (a) it was acceptable to the institu-
tion, and (b) candy is highly valued in such a
setting.

On the code-substitution test, each S was
given 18 trials on Form A followed by 12
trials on Form B. Trials were given in groups
of three, with two minutes between each set
of three trials. During the two minutes be-
tween sets E engaged S in trivial conversation
and before each group of three trials E said,
“Let’s see how fast you can do these.” After
one and one-half minutes of the rest period
following trial 18 on Form A had elapsed, E
placed a copy of Form B in front of S and
said, “This is different in two ways. First,
we will not do as many of these as we did of 
the others, and secondly, it’s different in that 
the numbers have been changed around, see? 
Otherwise, it is just like before; if you make 
a mistake and I see it, I'll say ‘wrong.’ If I 
don’t see it, but you do, correct it anyway. Re- 
member to work as fast as you can. Ready— 
go.” Except for the number of trials the pro- 
cedure for Form B was exactly the same as in 
the preceding 18 trials on Form A. 

Each S was then shown the mirror-tracing
apparatus and E explained its operation (di-
rect vision). Each S was then told: “Now you
take the stylus and place it at the start (direct 
vision). When I say ‘go’ the only place you 
can look is in the mirror (design was then 
shielded from direct vision). No matter what 
happens, you cannot look any place but in the 
mirror. Ready—go.” Many Ss reached a point 
where they apparently could not continue trac- 
ing and if they gave any indication of discon- 
tinuing, E said, “Just keep at it, you'll get 
on to it.” 

Upon completion of the mirror-tracing task 
E thanked each S and gave him a candy bar
as the reward which had been mentioned in
the instructions. Each S was requested not to
discuss the tests with any of the other boys 
and was excused from the room. 


Results 


Four comparisons were made between
groups on the code-substitution test: (a) per-
formance on Form A, (b) performance on
Form B, (c) impairment scores (B − A), and
(d) number of wrong substitutions on each
form.

It can be seen from Table 3 that there was 
no significant difference between the groups 
on Form A of the code-substitution test. It is 
also apparent that on the last 10 trials the per-
formance (i.e., habit strength) of the two
groups was nearly identical. In view of the
results on Form A, the finding that the FC
group obtained better scores on all 12 trials 
on Form B is taken to support the hypothesis 
that the C+CF group would perform less ade- 
quately than the FC group under the experi- 
mental condition of interference with goal-di- 
rected behavior. 


Table 3

Mean Scores in Seconds on Each Trial of Two
Forms of a Code-Substitution Test for Groups with
Primary Color Emphasis (C + CF) and Secondary
Color Emphasis (FC) on the Rorschach Test

              Group FC              Group C + CF
Trial     Form A    Form B      Form A    Form B
  1       80.25     94.45       80.50     102.45
  2       70.45     79.35       64.85     86.55
  3       65.35     77.70       61.25     83.55
  4       58.45     75.30       56.30     78.25
  5       59.10     75.10       56.45     80.95
  6       58.10     72.60       56.00     77.55
  7       52.90     69.20       50.35     72.45
  8       53.95     68.05       54.60     73.60
  9       52.80     66.35       52.50     73.60
 10       48.90     60.15       48.75     69.75
 11       51.10     61.50       52.25     67.60
 12       51.40     60.15       50.80     66.20
 13       48.85                 49.70
 14       51.35                 50.10
 15       51.00                 51.10
 16       46.90                 48.00
 17       48.40                 48.45
 18       49.10                 48.75





Impairment scores were computed by sum- 
ming the scores for each series of three trials 
on Form A and subtracting this sum from that 
obtained for each of the corresponding series 
of three trials on Form B. A total impairment 
score, based on all 12 trials, was similarly com- 
puted. 
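
The arithmetic just described can be sketched in a few lines of Python. This is an illustration only: it works from the FC-group trial means in Table 3 rather than from individual records, which the paper does not report, and the names `form_a`, `form_b`, and `series_impairment` are assumptions introduced here. Because a mean of sums equals a sum of means, the results should agree with the group mean impairment scores up to rounding.

```python
# FC-group mean times in seconds for trials 1-12, copied from Table 3.
form_a = [80.25, 70.45, 65.35, 58.45, 59.10, 58.10,
          52.90, 53.95, 52.80, 48.90, 51.10, 51.40]
form_b = [94.45, 79.35, 77.70, 75.30, 75.10, 72.60,
          69.20, 68.05, 66.35, 60.15, 61.50, 60.15]


def series_impairment(a, b, size=3):
    """Return the Form B sum minus the Form A sum for each block of `size` trials."""
    scores = []
    for start in range(0, len(a), size):
        sum_a = sum(a[start:start + size])
        sum_b = sum(b[start:start + size])
        scores.append(round(sum_b - sum_a, 2))
    return scores


per_series = series_impairment(form_a, form_b)  # series I through IV
total = round(sum(per_series), 2)               # all 12 trials
print(per_series)
print(total)
```

For the FC group the four series values sum to 157.15, the mean total impairment score given for that group in Table 5.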

Table 4 reveals that on each of the four 
series of three trials the groups differed sig- 
nificantly, in the expected direction, either in 
mean scores, variability, or both. 

Similarly, a comparison of the groups on 
total impairment scores reveals significant dif- 
ferences, again in the predicted direction, in 
both mean scores and variability. These data 
are presented in Table 5. 
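
The variability comparison can be illustrated by the ratio of the two group variances. The sketch below assumes that the F reported in Table 5 is the larger variance divided by the smaller, computed from the tabled standard deviations; the paper does not spell out its own computation, and the function name `variance_ratio` is hypothetical.

```python
def variance_ratio(sd_1, sd_2):
    """Return the larger variance divided by the smaller one."""
    var_1, var_2 = sd_1 ** 2, sd_2 ** 2
    return max(var_1, var_2) / min(var_1, var_2)


# Standard deviations of total impairment scores, taken from Table 5.
sd_fc = 64.12
sd_c_cf = 146.76

f_ratio = variance_ratio(sd_fc, sd_c_cf)
print(round(f_ratio, 2))
```

With these inputs the ratio rounds to 5.24, which agrees with the tabled F.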

When the groups were compared on num- 
ber of errors (i.e., placing a wrong number 
in a figure), no significant differences were 
found. 










Table 4

Comparison of Mean Impairment Scores for Each
Series of Three Trials on a Code-Substitution Task
for Groups with Primary Color Emphasis (C+CF)
and Secondary Color Emphasis (FC) on the
Rorschach Test

                                Series (trials)
              I               II              III             IV
          (1, 2, 3)       (4, 5, 6)       (7, 8, 9)       (10, 11, 12)
Group     M       SD      M       SD      M       SD      M       SD
FC        35.45   22.47   47.35   28.67   43.95   25.65   30.40   18.44
C+CF      65.95   35.68   67.45   41.98   62.20   45.32   51.75   41.88
diff.     30.50           20.10           18.25           21.35
t†        3.23**          1.86*           1.57            2.09*
F         2.52            3.15*           3.12*           5.16**





Note.— Where the F ratio was significant the correc- 
tion indicated by Edwards [2, p. 170] was made in ap- 
plying the t test. 

† One-tailed t test.
* Indicates p of <.05. 

** Indicates p of <.01. 


Table 5

Comparison of Total Impairment Scores in Seconds
on a Code-Substitution Test for Groups with Pri-
mary Color Emphasis (C + CF) and Secondary
Color Emphasis (FC) on the Rorschach Test

Group       Mean      SD        diff.    t†       F
FC          157.15    64.12
                                90.20    2.46*    5.24**
C + CF      247.35    146.76

† Corrected for significant F.
* Indicates p of <.02 (one-tailed test).
** Indicates p of <.01.


On the mirror-tracing test, the two groups⁴
were compared as to total time spent tracing
the design, number of times the stylus left the
pathway, and total time spent off the path-
way. None of the differences between the
groups approached significance and it must be
concluded that the groups could not be reli-
ably differentiated on the basis of their per-
formances on the mirror-tracing test.


Thus, while the results of the code-substi-
tution test clearly support the hypothesis, the
results of the mirror-tracing test fail to lend
such support. Whether the failure of the mir-
ror-tracing task to differentiate the groups can
be explained on the basis of the type of be-
havior involved, the manner in which the test
was administered, or whether the reaction to
habit interference is itself quite labile remains
an open question. The writer is inclined to
think, however, that the failure to provide
practice under direct vision prior to testing
under mirror vision was a serious error in the
way the test was administered.


4One S in each group violated the instructions,
hence, the N for this task is 19 in each group.


Summary 


The purpose of this experiment was to test 
the proposition that C and/or CF responses 
on the Rorschach test are indicative of a lower 
level of emotional maturity than are the FC 
responses. 


Two groups of Ss, equated on age, IQ, and 
several Rorschach variables, but differing in 
type of color emphasis on the Rorschach, were 
subjected to frustration in the form of habit 
interference. 


On the premise that a positive relationship 
exists between emotional maturity and ade- 
quacy of reaction to frustration, the group 
whose Rorschach color scores were predomi- 
nantly of the C and/or CF type were expected 
to show greater impairment in performance 
under frustration than the group whose color 
responses were predominantly of the FC type. 
Of the two experimental tasks used, only one 
reliably differentiated the groups. It is con- 
cluded that while the results lend some sup- 
port to the hypothesis, it remains an open ques- 
tion whether the failure of the second task to 
differentiate the groups is due to the task it- 
self or to the lability of reaction to habit in- 
terference. 


Homosexuality in Paranoid Schizophrenia as
Revealed by the Rorschach Test¹


David Grauer 


VA Hospital, Hines, Illinois 


The present study was an attempt to test 
Aronson’s [1] finding that the Rorschach rec- 
ords of paranoid psychotics reveal homosexual 
tendencies. In the course of another investiga- 
tion described elsewhere [4], the present writ- 
er analyzed the Rorschach records of a group 
of paranoid schizophrenics. Care was taken to 
eliminate all cases in which there might be 
reasonable doubt of the adequacy of the diag- 
nosis. This was assured, first, by a study of 
psychiatric case records to determine the pres- 
ence of unequivocal evidence of hallucinations 
and delusions and, second, by the fact that the 
diagnosis was a “final” one, agreed upon by 
at least two psychiatrists after a total hospitali- 
zation period ranging from three months to 
over two years. 


These criteria differ from those of Aronson, 
who used ratings of case records. The rating 
scale was designed to estimate the “extent to 
which the delusions pervaded the patient’s 
symptomatology” [1, p. 403]. The scale ex- 
tended from a score of zero, “minimally delu- 
sional,” to 6, “maximally delusional.” Only 
patients with a rating of at least 4 (‘“‘markedly 
delusional’) were selected for the experiment. 


The final group of paranoid patients select-
ed by Aronson consisted of 30 individuals, 28
of whom had, in fact, been given a diagnosis
of paranoid schizophrenia by the psychiatric
staff. Aronson’s technique selected out of the
total group of paranoid schizophrenics avail-
able those whose delusions were considered to
have the greatest influence. The published
data did not specify the total number of para-
noid schizophrenics in the preliminary sample,
but the total number of psychiatric records ex-
amined was given as 500. If the experience of
other psychiatric wards in VA hospitals can be
taken as a point of reference, it is estimated
that a fairly large proportion of the patients
would have been diagnosed as paranoid schizo-
phrenics. The 28 cases finally selected would
then constitute a small minority of all the pa-
tients thus diagnosed.


1From VA Hospital, Hines, Illinois. The writer
wishes to express his appreciation to Dr. Roy Bren-
er, Chief Clinical Psychologist, for his assistance in
scoring the Rorschach protocols and for his coop-
eration in making this study possible.

The fact that the Freudian hypothesis re- 
garding homosexuality and paranoia could be 
verified in only a small proportion of cases of 
paranoid schizophrenia would, in the writer’s 
opinion, seriously limit the applicability of the 
hypothesis. After all, Freud assumed that la- 
tent homosexuality was an important factor in 
the paranoid group as a whole. 

In other respects, the group of 31 paranoid 
schizophrenics in the present sample closely 
approximated Aronson’s paranoid patients. 
Both groups were composed of male war vet- 
erans, similar in average age (28 years vs. 30.8 
years), intelligence (IQ 100 vs. “‘roughly aver- 
age intelligence”), and educational level (com- 
pletion of 10 grades vs. 10.9 grades). 


Procedure 


The Rorschach protocols of the 31 paranoid 
schizophrenics in the present investigation were 
scored for the 21 signs of homosexuality select- 
ed by Aronson from the lists of Wheeler [7] 
and Reitzell [6]. The scoring was done inde- 
pendently by two clinical psychologists includ- 
ing the writer. Differences in scoring between 
the two psychologists were slight and were 
easily reconciled after a conference on discrep- 
ant points. 

In accordance with Aronson’s procedure, 
after the number of homosexual signs present 
in each patient’s protocol had been tabulated, 
the mean number of signs, mean number of 
signs based on first response to each card,
and mean percentage of signs to total num- 
ber of responses were calculated. In addi- 
tion, an attempt was made to determine the 
prognostic significance of homosexuality, in 
view of Wheeler’s assertion that the finding of 
homosexuality was generally considered to be 
an unfavorable sign by psychotherapists. Since 
the present group of patients was composed of 
an equal number of improved and unimproved 
patients, it was possible to ascertain the prog- 
nostic value of this factor. 


Results 


Table 1 shows comparisons between the pres- 
ent data and those for Aronson’s normal, non- 
paranoid psychotic, and paranoid groups. As 
can be seen, the number and percentage of 
homosexual signs are definitely lower in the 
present group than in Aronson’s for every 
comparison. Aronson also reported highly 
significant differences in homosexual signs be- 
tween his paranoid cases and his normal group, 
and between his nonparanoid psychotics and 
his paranoids (p = .001). The differentiation 
between Aronson’s normal patients and the 
nonparanoid psychotics was much less marked. 
Most of the latter differences were either not 
statistically significant or of borderline signifi-
cance (p = .02 to .10).











Table 1
Homosexual Signs in Rorschach Records

                                     Number of       Percentage      Signs on first
                                     signs           of signs        response
Group                          N     Mean    σ       Mean    σ       Mean    σ
Aronson’s normals              30    1.10    1.10    4.9     5.4     .57     .92
Aronson’s paranoids            30    7.10    4.21    22.9    12.3    3.47    2.05
Aronson’s nonparanoid
  psychotics                   30    1.90    1.96    8.5     7.5     .87     .88
Grauer’s paranoid
  schizophrenics               31    3.07    3.14    12.7    11.1    1.36    1.66





The most important comparison is that be-
tween the paranoid psychotics and the non-
paranoid psychotics. If it can be shown that the
paranoids show significantly greater homosex-
ual tendencies than the nonparanoid psychotics,
Freud’s theory is confirmed. Aronson’s finding,
previously mentioned, revealed a highly signifi-
cant difference between the paranoid and non-
paranoid psychotics in this respect. The present
data do not, however, confirm his results. As
can be seen from the table, the mean number of
homosexual signs for the paranoid group was
3.07, which differed from the mean number of
signs given by Aronson’s nonparanoid psy-
chotics by 1.17. The t was 1.75, which failed
to reach a minimum p of .05, revealing no
statistically significant difference in homosex-
ual signs between the present paranoid schizo-
phrenics and Aronson’s nonparanoid psychotics.
Similar calculations of percentages and signs
on first response on each card revealed t values
of 1.71 and 1.44, which were also below accept-
able standards of significance.
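
The first of these comparisons can be checked roughly from the tabled means and standard deviations. The sketch below assumes a t ratio formed as the difference between means divided by a standard error built from each group's own variance and N; the paper does not state which formula was used, and the function name `t_from_summary` is introduced here only for illustration.

```python
import math


def t_from_summary(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """t ratio for two independent groups, computed from summary statistics."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Mean number of homosexual signs: present paranoid schizophrenics (N = 31)
# versus Aronson's nonparanoid psychotics (N = 30), values from Table 1.
t_signs = t_from_summary(3.07, 3.14, 31, 1.90, 1.96, 30)
print(round(t_signs, 2))
```

Under this assumption the ratio comes to about 1.75, in line with the value reported in the text.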


Another comparison is suggested by Wheel- 
er’s original study of Rorschach signs of homo- 
sexuality. Wheeler found a median score of be- 
tween two and three homosexual signs in his 
group of 100 psychiatric patients in a VA 
mental hygiene clinic. These patients no doubt 
included various diagnostic categories with a 
preponderance of neurotic patients. Of the 31 
paranoid schizophrenics in the present study, 
17, or 55%, had less than three signs. On this 
basis, since Wheeler regards total number of 
signs as the significant factor, there would seem 
to be considerable overlap in degree of homo- 
sexuality shown by the paranoid group and 
Wheeler’s predominantly neurotic patients. 


A comparison of the current group with 
Aronson’s group of paranoids shows that of the 
31 patients, 28, or 90%, did not attain the 
mean homosexual score of Aronson’s paranoid 
patients. On the basis of percentages, the ratio 
of homosexual signs to total number of re- 
sponses in each record, 25, or about 81%, of the 
paranoid schizophrenics failed to reach the 
mean percentage of homosexual signs of Aron- 
son’s group of paranoid psychotics. 

In view of the commonly expressed opinion 
that the finding of latent homosexuality consti- 
tutes a prognostically unfavorable sign, it was 
decided to test this hypothesis on the present 
group, which consisted of improved and unim- 
proved patients. The mean number of homo-
sexual signs per first response to all the cards
was 1.31 for the improved group, 1.76 for the 
unimproved group. The mean percentage of to- 
tal homosexual responses was 15.6 for the im- 
proved group and 15.9 for the unimproved 
group. It is thus seen that the latent homosex- 
uality, insofar as it can be estimated from the 
Rorschach record by means of this technique, 
had no prognostic significance for the group 
of paranoid schizophrenics. 


Discussion 


The data failed to confirm Aronson’s find- 
ing that the Rorschach records of paranoid 
psychotics exhibit more homosexual content 
than those of nonparanoid psychotics, as would 
be expected from Freud’s [3] hypothesis re- 
garding paranoia. A comparison of the results 
with those of Wheeler [7] suggested that, in 
terms of total number of homosexual signs 
found, there was little difference in homosexual 
tendency between the present paranoid schizo- 
phrenics and the general run of psychiatric pa- 
tients in an outpatient mental hygiene clinic. 
Furthermore, the fact that most of the psychia- 
tric patients (60%) investigated by Wheeler 
were rated by psychiatrists as having latent or 
overt homosexual tendencies throws some 
doubt on the problem. To what extent is the 
paranoid psychotic unique in showing homo- 
sexual trends if most of the members of a heter- 
ogeneous group of psychiatric patients exhibit 
similar tendencies?


According to Fenichel [2], Freud’s theory 
of the relationship between latent homosexuali- 
ty and paranoia has been confirmed in all 
psychoanalytic investigations. Yet psychiatrists 
who have investigated large numbers of para- 
noid patients challenge this view. On the basis 
of their experience, Henderson and Gillespie 
concluded: “The importance of homosexuality
in the aetiology of paranoia is not so wide- 
spread as the psychoanalytic school would have 
it” [5, p. 385]. 


In Aronson’s [1] review of the psychiatric 
literature he reported the incidence of homosex- 
ual material found in studies of paranoid 
psychotics. With one exception, the highest per- 
centage of paranoid patients showing homo- 
sexual tendencies was 20. The exception is the 
report of an analyst who found that 55% of a 
group of paranoid schizophrenics showed homo-
sexual tendencies in their psychotic behavior.
This higher incidence can be explained by the
criterion of homosexuality used by the analyst.
In addition to the usual behavioral evidence
utilized by the other investigators, the analyst
included behavior showing “symbolic expres-
sions of homosexuality.”

It is thus seen that clear evidence of homo-
sexual behavior is found in only a minority of
paranoid patients. If latent homosexuality were
of such great importance in the etiology of par-
anoia as the Freudian hypothesis would indi-
cate, it is difficult to explain why these tenden-
cies do not become overt in the majority of
cases of paranoid psychoses. One would expect
that, with the breakdown of inhibition and de-
fense that occurs in a psychosis, a larger propor-
tion of paranoids would exhibit frank homosex-
ual material if Freud’s hypothesis were correct.


Summary


In this study an attempt was made to test the 
findings of Aronson that the Rorschach records 
of paranoid psychotics contain evidence of a 
higher degree of homosexual tendency than 
those of nonparanoid psychotics. The subjects 
in the present investigation consisted of a com- 
parable group of 31 paranoid schizophrenics. 
The Rorschach records were scored for homo- 
sexual content by means of Wheeler's signs. A 
comparison of the records of the paranoid 
schizophrenics in the present study with those 
of Aronson’s nonparanoid psychotics reveals 
no statistically reliable difference. It was also 
shown that homosexual tendencies, as measur- 
ed by Wheeler’s signs, have no prognostic signi- 
ficance for the group of paranoid schizophren- 
ics. The results of the present study are related 
to clinical psychiatric findings. 
Received April 26, 1954. 


Influence of the Preceding Test on the
Rorschach Protocol¹


Robert G. Gibby, Bernard A. Stotsky 


VA Mental Hygiene Clinic, Detroit, Michigan 


and Daniel R. Miller 


University of Michigan 


To what extent is the Rorschach protocol 
influenced by the set created by the preceding 
administration of another psychological instru- 
ment? In clinics, the Rorschach test is typical- 
ly given as part of a battery. If the protocol 
should vary with the nature of the previ- 
ous technique, it would become necessary to 
standardize the order of presentation. 


Recent studies of the Rorschach test point 
to its sensitiveness to many factors which are 
not currently controlled in its administration. 
Protocols differ depending upon the atmos- 
phere in which the test is taken [7], and 
whether or not anyone has troubled to tell the 
subject why he is being tested [9]. Also im- 
portant are variations in the perceptions of the 
task such as definitions of the purpose of the 
test and degree of ego involvement [1, 3]. 
Different examiners, depending upon their 
sex [4] and needs [8], obtain significantly dif- 
ferent scoring categories from comparable 
groups of subjects [2, 6, 11], and they vary 
in their interpretations of the same test rec- 
ords depending upon their own personalities 
[5]. In short, the Rorschach protocol reflects 
not only the subject’s projections but also the 
definition of the task, the nature of the setting, 
the examiner’s characteristics, and the unique 
aspects of the relationship between examiner 
and subject [10]. 


This paper is concerned with the possible 
influence on the Rorschach protocol of one as- 
pect of the setting, the experience of just hav- 
ing taken a previous test. The writers were in- 
terested particularly in the possibilities that the 
taking of an intelligence test would create a 
readiness to give a large number of Rorschach
responses, that a thematic instrument might
predispose the subject to see motion, that col-
ored blocks would sensitize him to color,
and that a drawing test would elicit a set to
focus on form.


1From VA Regional Office, Detroit, Michigan.


Procedure 


Subjects were selected at random from 
among patients at the Veterans Administra- 
tion Mental Hygiene Clinic at Detroit, Mich- 
igan. Most of them were neurotics but there 
were some character disorders and ambulatory 
psychotics. For purposes of control, it was de-
cided to exclude females, Negroes, organics,
disoriented psychotics, and those with IQ’s be-
low 80.

The procedure consisted of administering 
to each S one of four initial instruments, the
Bender-Gestalt, Thematic Apperception,
Wechsler-Bellevue,2 or Goldstein-Scheer-
er test, and then the Rorschach test. There
were five categories of Ss with twenty in each 
category, including a control group that re- 
ceived no test prior to the Rorschach. 
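The allocation described above (five conditions, twenty Ss in each) can be sketched as a simple shuffle-and-deal. This is only an illustration of random assignment under stated assumptions: the subject numbering, the fixed seed, and the function name `allocate` are hypothetical and do not come from the study.

```python
import random

# Condition labels follow the Procedure; everything else here is hypothetical.
CONDITIONS = [
    "Bender-Gestalt",
    "Thematic Apperception",
    "Wechsler-Bellevue",
    "Goldstein-Scheerer",
    "No prior test (control)",
]


def allocate(subject_ids, conditions, per_group, seed=0):
    """Shuffle the subjects and deal them into equal-sized conditions."""
    subject_ids = list(subject_ids)
    if len(subject_ids) != per_group * len(conditions):
        raise ValueError("subjects must fill every condition exactly")
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    rng.shuffle(subject_ids)
    return {
        condition: subject_ids[i * per_group:(i + 1) * per_group]
        for i, condition in enumerate(conditions)
    }


# 100 hypothetical subject numbers, twenty per condition.
groups = allocate(range(1, 101), CONDITIONS, per_group=20)
for condition, members in groups.items():
    print(condition, len(members))
```

Assigning from a shuffled list before any background information is consulted mirrors the precaution the authors describe, namely that the preceding test was fixed before the patients' problems were known.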

Each S was referred to one of 16 examiners
who was asked to administer the two tests in 
a particular order and with standardized in- 
structions. The tests preceding the Rorschach 
were assigned at random before any back- 
ground information concerning the patients’ 
problems was available. Once the protocols 
were obtained, they were coded and then 
scored by one of the writers. Ten records were 
then rescored by another psychologist for pur-
poses of testing reliability. Eleven variables
were tallied: number of responses (R), human
movement (M), pure form (F), total shad-
ing (TSH), total color (Tot C), whole
(W), common detail (D), rare detail (Dd),
human (H), animal (A), and number of
content categories. The average coefficient of
reliability of the scoring categories is over .90,
and none is below .80.


2In order to eliminate color from this test, the
Kohs Blocks were not used during the experiment.


Results 


To test the over-all differences among Ror- 
schach protocols administered under these five 
conditions, analyses were made of the vari- 
ances of the eleven scoring symbols. None of 
the results attains or even approaches signifi- 
cance. In fact, almost all of the F ratios are 
close to 1. It can be stated with considerable 
confidence, then, that for this sample, there is 
no tendency for an administration of the Ben-
der-Gestalt, Thematic Apperception, Wech-
sler-Bellevue, or Goldstein-Scheerer tests to in-
fluence subsequent performance on the Ror-
schach test.
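For readers who wish to see what an analysis of variance across the five conditions involves, the F ratio for one scoring symbol can be sketched as follows. The scores below are invented for illustration (the article reports no raw data), and the function name `one_way_anova` is hypothetical. With five groups of twenty, as in the study, the degrees of freedom would be 4 and 95.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of lists of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical numbers of responses (R) for four Ss in each of five conditions.
hypothetical_r = [
    [18, 22, 25, 15],
    [20, 17, 24, 19],
    [21, 16, 23, 20],
    [19, 23, 18, 20],
    [17, 21, 22, 20],
]
f_ratio, df_b, df_w = one_way_anova(hypothetical_r)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

An F ratio near 1, as the authors report for almost all eleven symbols, means that the group means differ about as much as would be expected from the variation among Ss within a group.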

Summary 

Because of the sensitivity of the Rorschach 
test to variations in the setting, definition of 
the task, and examiner’s characteristics, it was 
hypothesized that protocols would vary in ac- 
cordance with the nature of the initially ad- 
ministered test. No significant differences were 
found for 11 scoring categories with respect to 
the five experimental conditions: when the test 
was given first or when it was preceded by 
either the Bender-Gestalt, Thematic Appercep-
tion, Wechsler-Bellevue, or Goldstein-Scheerer
tests.
An Attempt to Influence the Rorschach Test by
Means of a Peripheral Set 


Josephine C. Kurtz and Margaret M. Riggs 


University of Connecticut 


In using a projective technique, the clinician 
usually starts from the premise that the for- 
mally scored variables reflect relatively central 
or permanent aspects of the personality. He 
does not expect momentary situational factors 
or the subject’s casual expectations about the 
task to distort, or even seriously color, these 
scored variables. 

For the Rorschach test, this assumption has 
come under scrutiny via two main avenues. 
First, both clinical and experimental evidence 
[1, 14] indicates that transient but dynamic- 
ally important personality tendencies may be 
evoked by the interpersonal relationship in- 
duced by the examiner [9], by some strong in- 
fluence immediately prior to testing [8], or by 
the test situation itself [7]. In such instances, 
the Rorschach may reflect the subject’s typical 
reaction in this sort of situation but may not be 
representative of his usual behavior. 

Secondly, there have been several attempts 
to distort test performance by wholly peripher- 
al, and supposedly non-ego-involving sets. To 
the extent that this is possible it raises devas- 
tating doubts about the premise that the scored 
variables necessarily reflect central or perman- 
ent personality variables. Hutt et al. [6] and 
Gibby [3] told subjects to try to see segments 
of the blots rather than wholes, and asked them 
to perceive movement whenever possible. They 
found significant increases in both perception 
of details and of movement. Rather than tell- 
ing subjects directly how to alter their records, 
Norman, Liverant, and Redlo [11] attempted 
to induce unconscious sets toward food and hu- 
man movement by showing a series of picture 
advertisements and then administering the Ror- 
schach immediately thereafter. These records 
were compared with Rorschachs given one 
week earlier or later. The results were com-
pletely negative; no increase in food or human
movement responses was apparent. Unfortu-
nately, there was no independent proof that
viewing the pictures had indeed induced any 
set. Thus the evidence that the Rorschach is 
resistant to such temporary sets was suggestive 
but not conclusive. 

In the present study we attempted to set up 
an unconscious peripheral set to perceive a 
large number of animals on the Rorschach; the
influence was expected to spread to other fac-
tors associated with, or to some extent depend-
ent upon, animal percepts. Rorschach himself
noted [13] that A% and F+ might be in-
creased, but at the expense of W, M, and O.
Lord [9] showed that A% varied with extent
of rapport; Abramson [1] found that A%
changed as a concomitant of a set to see more
W or more D. For Hutt [6] A% was among
the less reliable of the variables measured.
Thus there was some reason to suppose that 
A% might be vulnerable to peripheral influ- 
ences. We employed the Rees and Israel tech- 
nique [12] to induce a set to perceive animals. 
Briefly, subjects are shown pseudo words (e.g., 
chack) too rapidly for accurate perception. 
The experimental subjects are told that they 
would see animals and birds, while the control 
group is given no special slant. Ordinarily a set 
is created strongly enough so that without fur- 
ther instruction it carries over into a second 
task consisting of completing such partial words 
as s - - l. Thus the second task provides a di-
rect measure of the existence of the unconscious
set. The following hypotheses were set up, uti- 
lizing this technique: 

1. Immediately after demonstrating set on 
the word-completion task, the experimental 
subjects will perceive more animals on a group 
Rorschach test than the controls who received 
no such set. 


2. There will be more P, D, and FM but 
fewer M, vista, and diffusion responses among
the experimental subjects in comparison with 
the controls. 

3. If statistically significant differences are 
found for others of the conventionally scored 
variables, these will tend to indicate less cre- 
ativeness and less anxiety or “inner ferment” 
among the experimental subjects than among 
the controls. 

If all three hypotheses are proven untenable, 
clinicians would be supported in the assump- 
tion that temporary sets of a peripheral sort do 
not carry over into the Rorschach test. If the 
first is supported but not the other two, then 
such effects as might exist probably would not 
influence the total personality structure too 
seriously. If either the second hypothesis (con- 
cerned with the direct effect of increased ani- 
mals on other scores) or the third (embodying 
the assumption that a set is reassuring to a sub- 
ject but also dissuades him from trying varied 
approaches) is verified, then clinicians might 
be obliged to scrutinize a little more carefully 
even the fortuitous expectations their subjects 
bring to a projective task. 


Procedure 


Three separate laboratory sections of intro-
ductory psychology classes were utilized; each
was divided randomly into experimental and
control groups. In the experimental groups
there were 21 males and 10 females; in the
control groups there were 24 males and 7 fe-
males. The mean age of the control group was
21.5, of the experimental group 22.1, with 
SD’s of 3.32 and 3.38 respectively; the mean 
difference of .6 was insignificant. 

The control groups received the following
instructions:


Eight groups of letters will be shown to you 
rapidly. The exposure time for each group of letters 
is very short, so you will have to watch closely. Two 
practice items will be given to help you adjust your- 
self to the task. Since the exposure time is very short, 
set yourself to pay careful attention and to observe 
the groups of letters as carefully as possible. Do not 
speak out the answers nor ask any questions until the 
whole experiment is over. Record below in the 
proper order what you perceive in each case. 


The experimental groups received similar in- 
structions except for the set-inducing phrases: 


Eight words will be shown to you rapidly. The 
exposure time for each word is very short, so you 
will have to watch closely. Two practice words will
be given to help you adjust yourself to the task. Some 
of the members of the class belong to the “naive” 
group and do not know what the words are about. 
You belong to the “sophisticated” group; hence the 
following information is given you: Most of the 
words you are going to see are words having to do 
with animals or birds. Set yourself accordingly so 
that you will perceive as many of the words as pos- 
sible. Do not speak out the answers, nor ask questions 
about the words, for it is important that the “naive” 
group get no hint as to the nature of the words. 
Record below in the proper order the words which 
you perceive. 


After completing the first task, all groups 
were introduced to the second task as follows: 


Below you will find 15 skeleton words. Your task 
is to find real words (not slang nor proper names) 
which can be made out of the skeleton words by 
filling in the blanks. In each case you are to record 
the first word that you find fitting the requirements, 
and you are to see how quickly you can solve the 
15 items. You are to use your pencil only in record- 
ing each word after it has been discovered; do not 
write anything during the process of solution. Since 
this is a speed test, fill in the blanks as quickly as 
possible. Record your answers and the time taken 
to complete the list below. 


As soon as the second task had been com- 
pleted, the room was darkened and the group 
Rorschach was administered, with the follow- 
ing instructions: 


You know when you drop ink on a piece of paper 
and fold it and open it again, the ink makes weird- 
shaped blots or splotches on the paper. You will be 
shown this sort of inkblot projected on the screen. 
The ones shown have been selected from thousands 
because they can be seen many different ways by 
different people. Look at them as they come, and 
without telling your neighbor, write down on the 
left-hand side of the paper what they look like, what 
things you see in them, what they might resemble. 
You will have three minutes on each slide, to write 
answers at your own speed. Do you understand? 


All questions were answered permissively. 
After all ten slides had been exposed the in- 
quiry was introduced as follows: 


Now I am going to put each slide back on for 
about a minute and a half. On the right side of your 
paper, opposite each response, write down anything 
more about each response; whatever seemed im- 
portant to you; just how it was, or what kind it 
was, or what aspect of the blot was making it look 
more like that thing than anything else. This is so 
we can be sure to score what mattered most to you, 
since different people may notice different things 
about the very same answer. 


After this was done, location sheets were
passed out, and subjects were asked to draw a 
circle around each response if it did not include 
the whole blot, and label each circle. 


This inquiry was modified from the original 
Harrower-Erickson version [5] in that color, 
movement, etc. were not suggested, in order 
to avoid dampening any spontaneous effects of 
set that might be present. 

Without prior knowledge as to which were 
experimental and which control, the Ror- 
schachs were scored according to the Klopfer 
system with the addition of Rapaport’s com-
binatory score, and a special “Total Animal”
index. This consisted of the number of separate 
animals mentioned as main, additional, or al- 
ternate, expressed as percentage of the number 
of main responses plus the number of addition- 
al animal responses. For all analyses, the three 
laboratory sections were not differentiated. If 
fewer than 5 of the 62 subjects scored zero on 
a particular variable it was considered continu- 
ous; for such variables the arc-sine correction 
for percentages [15] was employed. The re-
sulting distributions were fairly normal; t was
used as a test of significance only for these 
variables. For the rest, chi square was used, 
with 2 by 2 tables constructed according to ex- 
perimental vs. control and high score vs. low 
score, the cutting point being that which most 
nearly divided the total population into halves 
of equal number [2]. 
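The three computational steps in this paragraph (the Total Animal index, the arc-sine correction, and the 2 by 2 chi square) can be sketched as below. Several details are assumptions rather than the authors' stated procedure: the arc-sine correction is given in one common form (the angle, in degrees, whose sine is the square root of the proportion), the chi square is computed without a continuity correction, and all function names and example counts are hypothetical.

```python
import math


def total_animal_index(animals_mentioned, main_responses, additional_animal_responses):
    """Animals mentioned, as a percentage of main responses plus
    additional animal responses (the definition given in the text)."""
    denominator = main_responses + additional_animal_responses
    if denominator == 0:
        raise ValueError("record contains no scorable responses")
    return 100.0 * animals_mentioned / denominator


def arcsine_transform(percentage):
    """One common arc-sine correction (assumed form): the angle in degrees
    whose sine is the square root of the proportion."""
    return math.degrees(math.asin(math.sqrt(percentage / 100.0)))


def chi_square_2x2(a, b, c, d):
    """Chi square with 1 df for the table [[a, b], [c, d]];
    no continuity correction is applied (an assumption)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical record: 9 animals among 16 main and 2 additional animal responses.
print(total_animal_index(9, 16, 2))
print(arcsine_transform(50.0))

# Hypothetical median split: rows are experimental and control,
# columns are high and low scorers.
print(round(chi_square_2x2(20, 11, 12, 19), 2))
```

Cutting at the point that most nearly halves the 62 subjects keeps the column totals close to 31 and 31, which is why a single 2 by 2 table suffices for each variable treated in this way.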


Results 
As shown in Table 1, on both Task I and
Task II experimental and control groups were
quite significantly different; moreover, when
Task II is broken into halves, the two groups
actually differ more significantly on the second
half than the first. Thus a set was not only
established in the first task and continued into
the second, but the difference between groups
was not diminishing.


Table 1


Means, SD’s, and Significance of Differences between
Experimental and Control Groups for Tasks I and
II and for Each Half of Task II


                                       Mean               SD
                                  Exper.  Control    Exper.  Control      t       p
Task I (8 items)                    5.1     2.6       1.65    1.17      6.77    <.001
Task II (15 items)                  9.5     6.3       3.34    2.47      4.22    <.001
Task II, first half (8 items)       5.0     3.5       1.98    1.64      3.20    <.01
Task II, second half (7 items)      4.4     2.9       1.67    1.29      3.89    <.001
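The t values of Table 1 can be recovered from the tabled means and SD's. The sketch below rests on two assumptions that the text does not state outright: that each group contains 31 subjects (which follows from the sex breakdown given under Procedure) and that the SD's were computed with N rather than N - 1 in the denominator. The function name `t_from_summary` is hypothetical.

```python
import math

N_PER_GROUP = 31  # 21 males + 10 females; 24 males + 7 females (see Procedure)


def t_from_summary(mean_exp, mean_ctl, sd_exp, sd_ctl, n=N_PER_GROUP):
    """t for two independent groups of equal size n, from means and SD's.

    Assumes the SD's use n in the denominator, so that the standard error
    of the difference is sqrt((sd1^2 + sd2^2) / (n - 1)).
    """
    standard_error = math.sqrt((sd_exp ** 2 + sd_ctl ** 2) / (n - 1))
    return (mean_exp - mean_ctl) / standard_error


# Means and SD's as read from Table 1: (exper. mean, control mean, exper. SD, control SD).
TABLE_1 = {
    "Task I": (5.1, 2.6, 1.65, 1.17),
    "Task II": (9.5, 6.3, 3.34, 2.47),
    "Task II, first half": (5.0, 3.5, 1.98, 1.64),
    "Task II, second half": (4.4, 2.9, 1.67, 1.29),
}

for label, row in TABLE_1.items():
    print(f"{label}: t = {t_from_summary(*row):.2f}")
# prints t = 6.77, 4.22, 3.20, and 3.89 for the four rows
```

Under these assumptions the computed values agree with the t column of the table to two decimals, which also illustrates the point made in the text: the second half of Task II separates the groups at least as sharply as the first.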

In spite of the strength of this demonstra-
tion of set in Task II, the results for the Ror-
schach were essentially negative. As shown in
Table 2, no significant differences existed for
all those variables amenable to t-test analysis.
The first seven variables were specified in our
hypotheses. Though the t’s are of insignificant
magnitude, it is interesting that six of the
seven are in the predicted direction, the excep-
tion being D%. An attempt was made to gain
more precision by matching subjects for total
number of responses, as shown in Table 3, but
again all values failed to reach the .05 level
of significance.


Table 2


Means, SD’s and Significance of Differences between
Experimental and Control Groups on
Rorschach Variables


                            Mean             SD
Variable                 Exp.   Cont.    Exp.   Cont.       t       p
A%                       42.4   41.5      7.9    8.8      .417    >.6
Total A%*                50.0   48.4      8.9   10.4      .639    >.5
Animal P*                 3.8    3.4      1.3    1.4     1.128    >.2
Total P                   4.7    4.4      1.5    1.7      .719    >.4
D%                       80.8   81.9     14.1   11.3     -.333    >.7
FM%                      22.8   20.9      7.5    9.4      .866    >.3
M%                       21.6   24.8     11.1    8.0    -1.279    >.2
F%                       38.0   37.2     11.1   10.2      .290    >.7
Combinatory %*           35.3   32.9     11.6    9.9      .864    >.3
W%                       52.1   50.7     19.0    9.9      .358    >.7
H%*                      27.2   26.7      7.7    7.6      .252    >.8
Object %*                25.3   25.7     10.6   10.0     -.153    >.8
Cards 8, 9, 10%          34.1   33.0      8.0    6.4      .590    >.6
Total no. response       17.8   16.3      6.7    5.6      .626    >.6

*Not part of the ordinary Klopfer scoring system as
used here; for total A% and combinatory, see text under
“Procedure.” Animal P is number of P seen other than
men and bow on Card III, and skin on Card VI. The
“object” category is expanded to include all nonliving
things, plus nature concepts without plants or animals
being specified, e.g., “landscape.”

If set produced any effect in the Rorschach,
it should be maximal for the upper half of the
experimental group on Task II and minimal
for the lower half of the control group. Even
between these extremes none of the Rorschach
differences even remotely approached signifi-
cance. This is an especially devastating refuta-
tion of our hypotheses since the low members of
the control group not only were given no set
toward animals but spontaneously produced
relatively few, while the high members of the
experimental group may be thought of as hav-
ing a combination of set to see animals and
a natural tendency to do so, one or both being
strong.

1The authors are greatly indebted to Dr. Walter
Kaess for this and for other statistical and method-
ological refinements throughout.


Table 3


Significance of Differences on Rorschach Variables with Experimental
and Control Groups Matched for Total Number of Responses


Variables specified                      Variables not specified
in hypotheses           t       p        in hypotheses              t       p
A%                    .214    >.8        F%                      -1.280    >.2
Total A%             1.334    >.1        Combinatory %             .575    >.5
Animal P             1.067    >.2        W%                        .829    >.4
Total P               .747    >.4        H%                        .057    >.9
D%                   -.969    >.3        Object %                 -.547    >.5
FM%                  1.285    >.2        Cards 8, 9, 10%           .405    >.6
M%                  -1.124    >.2

It was a tenable supposition that if set had 
produced consistency of behavior, the correla- 
tion between number of animals used during 
the tasks and A% on the Rorschach should be
higher for the experimental group than for the
controls. For Tasks I and II combined versus
A%, r was -.09 for experimental and -.26
for control subjects. For Task II alone versus
A%, r was -.17 for the experimental and
-.20 for the control group. These figures are
not significant (p>.05), but it is curious that
the direction should be consistently negative.

Usually we find a modest positive correlation
between F% and A%. If this relationship
were destroyed among the experimental group
but preserved among the controls, it would
suggest that set had had a distorting effect. The
r between F% and A% for the experimentals
was +.03, and for the controls +.32; again
neither is significant, though as in the bulk 
of the data the trend is in the predicted direc- 
tion. 
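The statement that none of these correlations is significant can be checked with the usual t test for a product-moment r. The sketch assumes 31 subjects in each group (from the Procedure) and a two-tailed test at the .05 level; the critical t of 2.045 for 29 df is the standard tabled value, and the function names are hypothetical.

```python
import math

N_PER_GROUP = 31        # group size, from the Procedure
T_CRITICAL_05 = 2.045   # two-tailed .05 point of t with 29 df (standard table value)


def t_for_r(r, n=N_PER_GROUP):
    """t statistic for testing a product-moment r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


def critical_r(t_critical=T_CRITICAL_05, n=N_PER_GROUP):
    """Smallest absolute r that reaches the given critical t with n - 2 df."""
    df = n - 2
    return t_critical / math.sqrt(t_critical ** 2 + df)


# Correlations reported in the two paragraphs above.
REPORTED = {
    "Tasks I + II vs. A%, experimental": -0.09,
    "Tasks I + II vs. A%, control": -0.26,
    "Task II vs. A%, experimental": -0.17,
    "Task II vs. A%, control": -0.20,
    "F% vs. A%, experimental": 0.03,
    "F% vs. A%, control": 0.32,
}

print(f"critical |r| at .05: {critical_r():.3f}")
for label, r in REPORTED.items():
    print(f"{label}: r = {r:+.2f}, t = {t_for_r(r):+.2f}")
```

With groups of this size an r must be roughly .36 in absolute value to reach the .05 level, so even the +.32 obtained for the controls falls short, as the authors state.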

When all of the Rorschach material was 
subjected to chi-square analysis, i.e., those vari- 
ables considered continuous and those not, of 
56 analyses, six were significant (p < .05) 
when fewer would have been expected by 


chance alone. These six are presented in 
Table 4. 


Table 4


Chi Square and p Levels for Those Rorschach
Variables Yielding Significant Differences
between Groups


Rorschach variables               χ²       p        Direction for exp. group
Total number of animals          4.15    <.05*      High
Add. animal responses            4.35    <.05       Absent
A + H in the same response       5.02    <.05       Absent
Rare detail (dr)                 5.02    <.05       Absent
Formed color (FC)                4.01    <.05       Absent
Unformed color (CF + C)          6.80    <.01       Present

*Of the variables for which predictions were made only
the total number of animals yielded a significant rela-
tionship; because of the one-tailed nature of the hypothe-
ses, p may be considered <.025 for this variable.
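A rough gauge of what chance alone would yield among 56 analyses at the .05 level is given below. The calculation treats the analyses as independent, which they are not (all were made on the same 62 subjects), so it should be read only as an approximate yardstick and not as part of the authors' analysis. The function name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(k_or_more, n, p):
    """P(X >= k_or_more) for X distributed as Binomial(n, p)."""
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(k_or_more, n + 1)
    )


N_ANALYSES = 56     # chi-square analyses reported in the text
ALPHA = 0.05        # significance level used
N_SIGNIFICANT = 6   # analyses reaching that level

expected_by_chance = N_ANALYSES * ALPHA
tail = binomial_tail(N_SIGNIFICANT, N_ANALYSES, ALPHA)

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"P(at least {N_SIGNIFICANT} of {N_ANALYSES} significant): {tail:.3f}")
```

The expected number is 2.8, so six significant results is indeed more than chance expectation; the tail probability under independence shows how often an excess of that size would arise in any case.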

Although the t for total number of animals
mentioned was not significant, more experi- 
mental than control subjects were above the 
median (p < .05). This is the only straight- 
forward indication of the effect of set in all 
the material. It must be remembered, however, 
that total number of animals mentioned as de- 
fined in this study, is not a traditional Ror- 
schach variable. Unexpectedly, it was the con- 
trol subjects who were apt to use additional 
responses containing animals, and who used 
animals fused with humans in the same re- 
sponse. Thus if the experimental subjects used 
animals at all, they were likely to make them 
the main content. It is not obvious why it 
should have been the controls who used the 
rare locations (dr), unless indeed set had the 
narrowing effect predicted in Hypothesis 3. 
There seems to be no ready explanation for the 
prevalence of amorphous color among the ex-
perimental group and formed color among the
control group; it is tempting to attribute these 
last two to chance! 


Discussion 


The evidence seems to indicate that percent- 
age of animals seen on the Rorschach blots is 
not vulnerable to distortion by a temporary 
set, though there were a priori reasons for sus- 
pecting it to be inherently less stable than many 
other Rorschach variables. Unfortunately, this 
experimental design did not entirely rule out 
the possibility that the set, having carried over 
strongly to a second task, simply vanished in 
the face of a third task. All that we can say is 
that it was certainly present immediately prior 
to the Rorschach and not clearly demonstrable 
during it. If a fourth task had been presented 
after the Rorschach we do not know whether 
experimental and control groups would have 
been differentiated again. It seems to be the
nature of sets, however, to persevere and in- 
crease in strength the longer they prove success- 
ful for problem solving [10]. We find it diffi- 
cult to believe that the set to see animals 
simply dissipated itself spontaneously in the 
minutes between Task II and the Rorschach. 
All the evidence suggests, therefore, that the 
Rorschach evoked from all the subjects their 
characteristic behavior so strongly as to over- 
ride the existing set; in effect, the subject fol- 
lowed some more fundamental personal tend- 
ency, quite as Rorschach workers assume. The 
very fact that the insignificant data showed a 
persistent trend in the predicted direction ar- 
gues against the supposition that no set existed 
at the moment when the Rorschach was pre- 
sented, while the failure of the effect to reach 
any great magnitude testifies to the genuine co- 
erciveness of the Rorschach material. So far as 
this study is concerned, clinicians remain secure 
in their assumption that implicit peripheral sets 
will not influence Rorschach results to any ap- 
preciable degree. 


Summary and Conclusions 


1. College students were randomly divided 
into a control group and an experimental group, 
the latter being given a set to perceive animals 
by the Rees and Israel technique [12]. On an 
initial task (word recognition) the two groups 
were significantly differentiated and remained 
so on a second task (word completion) with
p < .001. Immediately thereafter all subjects 
were given the group Rorschach. 

2. It was our hypothesis that having taken 
the set, the experimental subjects would per- 
ceive more animals on the group Rorschach 
test than the controls. 

3. Despite clear indications of set in Task 
II, the Rorschach results were essentially neg- 
ative. No significant differences were found 
for any of the variables for which t test was ap-
propriate; when cases were matched for total
number of responses the p levels were no
higher.

4. Initially we had predicted that among 
the experimental group there would be higher 
A%, more animals mentioned, higher D%, 
more populars, more FM, but fewer M, vista, 
and diffusion responses. Although, except for 
D%, the obtained differences were in the pre- 
dicted direction, none reached statistical signi- 
ficance. 

5. When all scored variables were treated 
by chi square, of the 56 analyses only six 
achieved significance, and only two of these 
even remotely supported any of the original hy- 
potheses. 

6. All our evidence suggests that the Ror- 
schach material was genuinely coercive, evok- 
ing from the subjects their characteristic be- 
havior and overriding a strongly established 
pre-existent set. So far as this study is con- 
cerned, Rorschach workers remain secure in 
the assumption that implicit peripheral sets 
will not influence test results to any appreciable 
extent. 


Received August 25, 1954. 
Inter-Judge Agreement on Traits Rated
from the Rorschach1


Leonard Gelfand, Bruce Quarrington, Harley Wideman, and 


Jean Brown 
Department of Psychiatry, University of Toronto 


One concomitant of the recent trend toward 
global validation of projective techniques is 
an attempt to utilize clinicians’ judgments in 
the design of validation studies. Hence it is 
necessary to establish the extent of agreement 
between judges. This paper reports a study to 
determine the correlation between the ratings 
of six clinicians who rated 36 patients on 10 
traits from the Rorschach. 

All six raters were skilled clinical psycholo- 
gists, had received recognized Rorschach train- 
ing and supervision, and had from four to eight 
years experience with the technique. They par- 
ticipated in the research design and hence knew 
the purpose for which they were rating the 
subjects. 


The 36 patients were selected by the raters 
from the closed case file at the Toronto Psychi- 
atric Hospital. One-half were inpatients and 
one-half outpatients. The age range was from 
17 to 62 with a mean of 30.03 and a median 
of 27.5 years. The subjects fell into the fol- 
lowing diagnostic categories: anxiety state, 8; 
depression, 3; obsessive-compulsive, 1; undif- 
ferentiated psychoneuroses, 7; adolescent be- 
havior disorder, 2; undifferentiated schizophre- 
nia, 6; preschizophrenic disorders, 4; paranoid
schizophrenia, 1; affective psychoses (manic-
depressive, psychotic depression, involutional
disorders), 4.


1An extended report of this study, which was fi-
nanced by a Provincial-Federal Health Grant, may
be obtained without charge from L. Gelfand, 241
Elizabeth St., Toronto, Ontario, or for a fee from
the American Documentation Institute. To obtain it
from the latter source, order Document No. 4397
from ADI Auxiliary Publications Project, Photo-
duplication Service, Library of Congress, Washing-
ton 25, D. C., remitting in advance $1.75 for micro-
film or $2.50 for photocopies. Make checks payable
to Chief, Photoduplication Service, Library of Con-
gress.

The 10 traits rated were: desire to gain 
insight, feeling of physical inadequacy, dis- 
satisfaction with self, impulsivity, dissatis- 
faction with life situation, basic intellectual po- 
tential, assertiveness, present intellectual func- 
tioning level, identification with social role of 
own sex, and use of wish-fulfilment fantasy. 
Each trait was rated on a 10-point scale. 

Patients were rated individually by each 
judge. The information available to the judge 
was the complete, scored Rorschach, the age 
and sex of the patient, whether he was a hos- 
pital or clinic patient, and which of their col- 
leagues administered and scored the Rorschach. 

Product-moment correlations were calcu- 
lated for the ratings of each judge with every 
other judge on each trait. The 150 intercorre- 
lations ranged from —0.116 to +0.716 with 
the estimated median about 0.30 and nine neg- 
ative correlations. 
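The figure of 150 intercorrelations follows from the design: six judges give 15 pairs of judges, and each pair is correlated on each of 10 traits. The sketch below makes that bookkeeping explicit. The ratings are randomly generated stand-ins, since the article reports no raw ratings, and the seed and function name are hypothetical.

```python
import itertools
import math
import random
import statistics

N_JUDGES, N_PATIENTS, N_TRAITS = 6, 36, 10  # figures from the text


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings on a 10-point scale: ratings[judge][trait] is a list
# holding one rating for each of the 36 patients.
rng = random.Random(1954)
ratings = [
    [[rng.randint(1, 10) for _ in range(N_PATIENTS)] for _ in range(N_TRAITS)]
    for _ in range(N_JUDGES)
]

# One correlation per pair of judges per trait.
correlations = [
    pearson_r(ratings[j1][trait], ratings[j2][trait])
    for trait in range(N_TRAITS)
    for j1, j2 in itertools.combinations(range(N_JUDGES), 2)
]

print(len(correlations))  # 150 = 15 pairs of judges x 10 traits
print(round(statistics.median(correlations), 2))
```

Because the stand-in ratings are random, their median correlation will lie near zero; the reported median of about 0.30 therefore reflects some agreement among the judges, though far less than would be wanted for validation work.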

The correlations are judged as too low to 
indicate consistency or agreement between the 
raters’ judgments and hence no tests of signifi- 
cance have been applied. It seems reasonable to 
conclude that the judges disagree in their rat- 
ings of the patients. The disagreement may be 
the result of three factors: (a) the hetero-
geneity of the patient sample, (b) the com-
plexity of the material judged, and (c) the
lack of adequate definition of trait terms 
coupled with the use of differential Rorschach 
criteria for judging these traits. 

Brief Report 
Received September 13, 1954. 
Books


Anastasi, Anne. Psychological testing. New York: 
Macmillan, 1954. Pp. xiii + 682. $6.75 


The field of testing has grown so rapidly that 
most contemporary books on psychological testing 
are either extensive and superficial, or intensive and 
incomplete. Anastasi’s new college text achieves a 
nice compromise between these extremes. This is not 
a compendium of present-day tests, each briefly de- 
scribed and briefly criticised. Neither is it a recapit- 
ulation of the manuals of a few widely used tests. 
It is a carefully planned and integrated volume, 
which begins with a consideration of the principles 
of psychological testing, proceeds to employ carefully 
selected examples of tests of general classification, 
aptitude, and achievement, and ends with measures 
of personality characteristics ranging from inven- 
tories through projective techniques and situational 
tests. In every instance, the instruments selected for 
illustration are representative, and their advantages 
and shortcomings are handled critically. Although 
the book is primarily a college text, it cannot fail to 
attract the attention of practicing psychometricians 
and clinicians. Particularly important for both stu- 
dents and practitioners are the sections on the ethics 
of control of tests, on test validity, on the limita- 
tions of infant and preschool tests, and on the pe- 
culiar characteristics of projective techniques and 
situational tests. — A.M.G. 


Beck, Samuel J. The six schizophrenias, reaction pat- 
terns in children and adults. Research Monograph 
No. 6. New York: American Orthopsychiatric 
Association, 1954. Pp. xi + 238. $5.00. 


This research monograph reports an interdis-
ciplinary study of outpatient schizophrenics, based
on systematic clinical descriptions by psychiatrists 
and Rorschach interpretations by psychologists. The 
major conclusions come from three inverse factor 
analyses of Q sorts of 120 descriptive items. One fac- 
tor analysis was of 12 patients, 7 adults and 5 chil- 





Note.—The reviews were prepared by the Editor 
and Associate Editors, who may be identified by 
their initials. 
dren, on whom the psychiatrists and psychologists
agreed; a second was of the 8 patients, 5 adults and 
3 children, on whom they disagreed. The third in- 
verted factor analysis was of 20 children, “some” 
(number unreported) of whom were included in the 
two preceding analyses. The first analysis revealed 
three types of schizophrenia, circumspectly labeled 
S-1, S-2, and S-3, which might be characterized 
briefly as intellectually disrupted, affectively dis- 
turbed, and rigidly controlled. The second analysis, 
of the eight “disagreement” cases, disclosed two more 
types, SR-1 and SR-2, which were differentiated 
only by Rorschach patterns, not by clinical descrip-
tions. The analysis of the children revealed one
further type, S-G. Studies are also reported of the
comparison of the Rorschach scores of schizophrenics,
neurotics, and normal controls, of the development
of schizophrenia in children based on Rorschachs
2 to 12 years apart, and of environmental conditions
associated with schizophrenia. Although the re-
search uses refreshingly original approaches and
represents an impressive amount of labor, it falls
short of achieving its potentialities. First, insufficient
use is made of the normal control data. One would
like, for example, to know how many Rorschachs of
clinically normal persons would fall into one of the
“schizophrenic” types. Second, the data are not al-
ways optimally reported or analyzed. As one in-
stance, the criteria for choosing the crucial “agree-
ment” and “disagreement” groups of patients are
not given, and the reviewer’s digging into correla-
tion matrices in an appendix unearthed the distress-
ing observation that the psychiatrists and psycholo-
gists showed more agreement on 3 of the 8 “dis-
agreement” cases than on 3 of the 12 “agreement”
ones. Reliability is treated with more optimism than
data; again, the appended tables reveal that the cor-
relations between the Q sorts made by two Rorschach
interpreters for 15 cases ranged from .01 to .65 with 
a median of .50, certainly not an impressive scoring 
reliability. Third, the need for cross validation is not 
recognized. Fourth, in spite of the author’s pleas for 
idiographic research, there is question of the gener- 
ality of findings based mainly on three inverse factor 
analyses of 12, 8, and 20 cases which included only 
12 adults, all outpatients. The study makes con-
tributions to method and proposes useful hypotheses
for further research. It almost surely does not, as the
title implies, identify the six schizophrenias. — L.F.S.


Burt, Cyril. The causes and treatment of backward- 
ness. New York: Philosophical Library, 1953. Pp. 
128. $3.75. 


The author describes this book as an abridgment 
and revision of his earlier The Backward Child of 
fifteen years ago. The new volume is primarily con- 
cerned with educationally substandard children. We 
think it regrettable that the “educationally” was not 
incorporated in the new title, since many readers 
will anticipate that the book deals with the mental- 
ly backward. Actually there is more need for ex- 
position of the multiple causes of scholastic retar- 
dation than of mental subnormality, and the more 
specific title would have drawn a wider audience 
and supplied a more useful purpose. We also wish 
that the author might have elaborated the meaning 
of “educational” backwardness, since slow learning 
has directions as well as degrees. We regret this 
more in the light of the author’s authority, insight, 
and experience which give this exposition both 
breadth and merit, and which in spite of beguiling 
simplicity, impresses us as excellent comprehension 
of children’s learning barriers. — E.A.D. 


Cronbach, Lee J. Educational psychology. New 
York: Harcourt, Brace, 1954. Pp. xxvii + 628. 
$7.50. 


Cronbach has made an excellent integration of the 
resources of psychology with the needs of education. 
Psychologists will like this book because of its sound 
but simple handling of evidence and theory; edu- 
cators will like it because of the clear awareness of 
the realities of the classroom. Students will enjoy 
its lucid writing and attractive format. — L.F.S. 


English, O. Spurgeon, & Finch, Stuart M. Introduc- 
tion to psychiatry. New York: Norton, 1954. Pp. 
viii + 621. $7.00. 


Intended as a beginning text for medical students, 
Introduction to Psychiatry can serve as well to ori- 
ent graduate students in psychology and related so- 
cial sciences to modern psychiatry. Avowedly “dy- 
namic” and Freudian, the treatment in most of the 
volume is essentially eclectic. It follows rather close- 
ly the Classification of Mental and Emotional 
Diseases adopted by the American Psychiatric As- 
sociation in 1952. In addition to the systematic 
treatment of personality disorders, there are excellent,
succinct chapters on personality development and
structure, child psychiatry, principles of psycho-
therapy, and mental hygiene. The chapter on mental
deficiency is weak. A tremendous amount of psy- 
chiatric material is covered in this text, most of it 
necessarily brief — perhaps too brief for reference 
purposes. It was designed, however, as an “intro- 
duction,” and as such is excellent. — M.K. 


Hodgson, Kenneth. The deaf and their problems. 
New York: Philosophical Library, 1954. Pp. xx + 
348. $6.00. 


The author has divided the book into three sec- 
tions. The first part is on the ear and the mechanism 
of hearing. Part two is a historical survey of the 
problems of the deaf from ancient and medieval 
times. Part three is a discussion of the problems of 
the deaf in the twentieth century. Because we often 
learn from the mistakes of the past, the book fur- 
nishes excellent background material for one who is 
interested in the problems of the deaf and in mod- 
ern-day solutions. The author strongly favors the 
view that the conflict between the oral and silent 
methods of teaching the deaf should have been al- 
lowed to die out; there should be experimental effort 
on an international scale on the correct teaching of 
the deaf. While work in America is mentioned, the 
author seems much better informed on the methods 
used in Europe and the British Isles. The book is 
strongly oriented toward English problems and 
schools. — B.M.L.


Hurlock, Elizabeth B. Developmental psychology.
New York: McGraw-Hill, 1953. Pp. ix + 556. 
$6.00. 


“The purpose of this book is to give as complete a 
picture of the developmental changes of the total life
span of the human being as is possible within the 
two covers of one book.” These changes are pri- 
marily psychological, with only sketchy reference 
to the medical and social aspects. This volume ap- 
parently telescopes material from the author’s earlier 
successful books on child and adolescent develop- 
ment, with the addition of chapters on adulthood 
and old age. The contents are impressive for both 
generalization and detail. The coverage is broad and 
eclectic yet skillfully discriminative. Obviously no 
serious depth could be included throughout, but one 
gets the impression of an adequate and scholarly 
survey of a field which has rapidly reached its own 
fruitful maturity. — E.A.D.


Lazarsfeld, Paul F. (Ed.) Mathematical thinking in 
the social sciences. Glencoe, Ill.: Free Press, 1954. 
Pp. 444. $10.00. 


These papers, by eight authors, were selected as 
examples of situations which might typically devel- 
op between the social sciences and mathematics. This 
is certainly an important book, but most of it is ex- 
tremely difficult for the social scientist with no more 
than the “modicum of training in mathematics” sug- 
gested by the editor as sufficient. Anderson develops 
a model for analyzing changes in attitudes over 
time on the basis of analysis of stochastic processes. 
Its application is demonstrated with a panel study 
of potential voters who were interviewed six times 
during an election campaign. More general models 
are also developed. Rashevsky expresses basic prob- 
lems in interpersonal contacts in terms of differen-
tial equations. His first paper deals with imitative
behavior and the second with the distribution of 
status and its alteration through repeated contacts. 
Coleman gives a further explication of Rashevsky’s 
models, including two others in his analysis, and 
shows specific relations between the mathematical 
elements of the models and the elements of the con- 
crete social situations. Marschak discusses the role 
of probability theory in the social sciences with em-
phasis on “subjective probability.” As an example of
the relation between probabilistic models and social 
problems he analyzes the effect of a government tax 
policy. Guttman’s first contribution is a develop- 
ment of the psychological implications of the mathe- 
matics of his scalogram analysis, already well known 
to psychologists. The second presents a generalized 
form of factor analysis. Lazarsfeld’s paper concerns 
the relations between social research procedures and 
the basic equations of latent structure analysis. Si- 
mon contributes a general discussion of model 
building in the social sciences. — A.R. 


Lorand, Sandor. (Ed.) The yearbook of psycho- 
analysis. New York: International Universities 
Press, 1954. Pp. 350. $7.50. 


The ninth volume of the Yearbook resembles its 
predecessors in furnishing a range of papers, re- 
printed from other sources, representing both the- 
oretical and clinical contributions to psychoanalysis. 
While many of the writings expand, interpret, or 
paraphrase Freud, the total sweep of the volume is 
such that a diversity of contemporary viewpoints is 
provided. In the reviewer's opinion, the papers by 
the Bernfelds (“Freud’s First Year in Practice”),
by Klein (“The Origins of Transference”), by
Szasz (“On the Psychoanalytic Theory of In-
stincts”), by Lewin (“Phobic Symptoms and Dream
Interpretation”), by Brewster (“Separation Reaction 
in Psychosomatic Disease and Neurosis”), by Ran- 
gell (“The Analysis of a Doll Phobia”), and by 
Roheim (“The Evil Eye”) illustrate most clearly 
the current developments and controversies selected 
for inclusion. The volume makes available in con- 
venient form a great many contributions to psycho- 
analysis which otherwise might well go unnoticed 
by those not directly in the analytic field. — A.M.G.


Pickford, R. W. The analysis of an obsessional. 
New York: Norton, 1954. Pp. xii + 234. $4.00. 


Valuable because of the extensive material it pre- 
sents on a single case, this book is a disappointment 
on three scores. First, the record of the analytic in- 
terviews is a reconstruction from the therapist’s 
notes, not an electrical transcription. Consequently, 
in spite of the richness of the reported observations, 
readers must wonder about the extent to which they 
are colored by the analyst’s preconceptions and the- 
oretical orientation. Pickford’s remarkable statement 
that “The real remedy for the supposed weakness 
in this lack of ‘objective’ recording is surely to be 
psycho-analysed oneself,” made in defense of his
not taking a verbatim record, represents the second
source of disappointment. The book has the tone of
a defensive dogmatism that discourages critical
thought and tends to stereotype the insightfulness
that case presentations can often elicit. Third, al-
though it is written by a distinguished psychologist
with a brilliant record in experimental fields, The
Analysis of an Obsessional shows no sign of its
author’s professional identity. The potential contri-
butions of psychology appear to have been thorough-
ly forgotten in favor of a wholesale capitulation to
psychoanalytic orthodoxies. While this volume may
have heuristic importance as an illustration of one
way of dealing with a particular kind of psycho-
therapeutic problem, it seems quite lacking in novel-
ty, inventiveness, and the refreshing tough-minded-
ness that one can legitimately demand from this
kind of literature, especially when prepared by an
outstanding psychologist. — E.J.S.


Sarason, Seymour B. The clinical interaction, with 
special reference to the Rorschach. New York: 
Harper, 1954. Pp. x + 425. $5.00. 


Sarason has produced a unique volume and a most 
stimulating one. The book is intended to serve as a
text for instruction in the Rorschach and was de- 
veloped from the author’s classes. But it is much 
more than a Rorschach book. Its thesis is that the 
Rorschach cannot be understood in isolation, but 
only in terms of the interpersonal interactions of the 
clinical situation. The first eight chapters, therefore, 
do not deal with the test at all, but with the vari- 
ables inherent in interpersonal relations — the stim- 
ulus task, the time and place, the clinician as a per- 
son, his age and sex, and the attitudes and expecta- 
tions of the client. In discussing these variables, Sar- 
ason draws upon experimental studies as well as 
case illustrations from social psychology, psycho- 
therapy, and psychodiagnostics. Only on page 109 
does the student meet the Rorschach itself, in a well- 
written series of chapters on instructions, location, 
color, shading, movement, form, content, and in- 
tegrated interpretation. Extensive use is made of 
research findings; no other work on the Rorschach 
supports its doubts and claims with so much data. 
The negative nature of much of the evidence is 
faced frankly. The text concludes with an analysis 
of six cases, designed to illustrate the processes of 
inference by which the psychologist draws his con- 
clusions. The volume quite evidently leaves its au- 
thor conflicted — a feeling that will be shared by 
many readers. When we remove from the Rorschach 
all of the beliefs controverted by evidence, the re- 
maining substance is thin. Instructors who like their 
students to think, instead of to be indoctrinated in 
a ritual, may well consider this book seriously. 
—L.F.S. 


Stolz, Lois Meek. Father relations of war-born chil-
dren. Stanford, Calif.: Stanford Univer. Press,
1954. Pp. viii + 365. $4.00 (Paper).


This study of the adjustments of fathers and their 
first-born children to stress determined by the return 
of the father from overseas is outstanding for its 
testing of clinically conceived hypotheses by means 
of objective and appropriate methods. Nineteen fami- 
lies, separated by war during the first pregnancy of 
the mother and reunited after the first child was 
at least one year old, constitute the major group for 
study. These families are compared with control 
groups of families in which separation did not oc- 
cur. The major hypothesis of the study is that the 
war-separated families show adjustment difficulties, 
particularly between father and child, and that the 
father’s attitude affects the development of the child 
adversely. Methods of testing this and other hy- 
potheses were devised with care, and included in- 
terviews with fathers and mothers, free observations 
of children with peers and adults, and studies of 
children in projective play situations, such as ag- 
gression with balloons and blocking, doll play, story 
completions, and dramatic play completions. The 
major and minor results of the study are numerous 
and highly significant for our understanding of per- 
sonality development in early infant-parent inter- 
action. The dynamic pattern of father-child rivalry 
for the mother, withdrawal by the child, and con- 
sequent paternal distress shown in the development 
of authoritarian attitudes, for example, provides an 
important contribution to our knowledge of how 
adjustive techniques evolve. — A. M. G. 


Sullivan, Harry Stack. The psychiatric interview. 
New York: Norton, 1954. Pp. xxiii + 246. $4.50. 


In this volume Otto Will has organized some of 
the lectures and writings on interviewing by the late 
Harry Stack Sullivan. Dr. Will has done an excel- 
lent editorial job and contributed a helpful intro- 
duction. Sullivan’s comments are both learned and 
lively. He thoroughly understood the interpersonal 
dynamics of interviewing, and writes of it with 
evident enjoyment. His style is penetrating and salty, 
combining a feeling for the theoretical significance 
with a flair for earthy interpretation. The clinician
who reads this book will learn a lot about inter- 
viewing and a lot about Harry Stack Sullivan, and 
he will have a delightful time doing it. — W. A. H. 


Swartz, Harry. The allergic child. New York: Cow- 
ard-McCann, 1954. Pp. 297. $3.95. 


Most of us are perfunctorily alerted to the dis-
comforts and annoyances of allergic idiosyncrasies.
But relatively few laymen appreciate the multiple 
consequences, both direct and secondary, associated 
with such morbid sensitivities. Hence this book may 
well impress some readers as rabid and even propa- 
gandistic. There is, indeed, an initial aura of patent- 
medicine advertising that creates fear or at least 
skeptical apprehension which then changes to earnest 
concern. The widespread ramifications of body- 
system damage are perhaps less understood than the 
dynamic, behavioral, and learning sequelae. The re- 
viewer is unequal to appraising the strictly somatic 
significances, but anyone touched with allergy or as- 
sociated with allergy-prone children will acquire a 
more informed and determined view of this per- 
vasive distress. The child-guidance therapist will 
find herein one more and highly significant desider- 
atum in his appraisals of the problems of children 
and adults. — E. A. D. 


Westwood, Gordon. Society and the homosexual.
New York: Dutton, 1953. Pp. 191. $3.00. 


“This is not a medical treatise, but an attempt to
evaluate the social implications of homosexuality.”
It “includes sexual activities . . . between two adult
men. It does not include infantile and early pubertal
homosexuality . . . and it does not include the pseu-
do-homosexuals or the infanto-homosexuals.” Its
“case,” as noted by Dr. Edward Glover's introduc- 
tion, is “in brief, that homosexuality ... is not... 
the lecherous perversion of self-indulgent degener- 
ates, but . . . a powerful unconscious force to which, 
in other forms, civilization owes much of its strength 
and some of the greatest of its achievements.” In 
short it offers a sympathetic exposition of the more 
recently tolerant and analytically insightful interpre-
tations without, it seems to the reviewer, adding
materially thereto. — E. A. D.


Test

Andrew, Gwen, Hartwell, Samuel W., Hutt, Max
L., & Walton, Ralph E. Michigan Picture Test.
Ages 8-14. Individual or small-group test. 1 form.
Untimed. Set of 16 plates ($6.00); analysis sheets
($1.60 per 20); Manual, pp. xxiii + 108 ($2.00).
Also, Rating Scale for Pupil Adjustment ($1.15
per 20); with manual, pp. 4; specimen set ($9.00).
Chicago: Science Research Associates, 1953.

The Michigan Picture Test is a new appercep- 
tion test for children 8 to 14 years of age, developed 
through seven years of research by the Michigan 
Department of Mental Health. No other picture- 
story test has been accompanied by so much data 
relevant to its standardization and validation. The 
test may be used in several ways. For screening, four 
“core” pictures are used, which represent a family 
situation, a peer-group situation, lightning at night, 
and a blank card. Responses to these four plates may 
be scored for variables validated against a criterion 
of teacher ratings of adjustment — tension index, 
verb tense, and direction of forces — from which a 
combined maladjustment index may be obtained. 
Scoring reliability is good, but there are no data 
on subject response reliability. The analysis sheet 
provides space for recording four more variables 
quantitatively — interpersonal relations, personal 
pronouns, popular objects, and psychosexual level — 
but users are warned that these scores bore no signif- 
icant relation to the criterion. An entire series of 
twelve plates, four of which have separate cards 
for boys and girls, may be used for full clinical 
study. The well-prepared manual contains an essay 
by Hutt on the psychology of picture tests, and a full 
account by the other authors of the selection of the
pictures and the validation and standardization of
the test. The Michigan pictures provide a useful
clinical tool for an age group hitherto ignored in
the construction of apperception tests. — L. F. S.